This is a special post for quick takes by Nathan Helm-Burger. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.


Cute demo of Claude, GPT-4, and Gemini building stuff in Minecraft: https://youtu.be/Xd5PLYl4Q5Q?si=EQ7A0oOV78z7StX2

I feel like I'd like the different categories of AI risk attenuation to be referred to as more clearly separate:

AI usability safety - would this gun be safe for a trained professional to use on a shooting range? Will it be reasonably accurate and not explode or backfire?

AI world-impact safety - would it be safe to give out one of these guns for $0.10 to anyone who wanted one?

AI weird complicated usability safety - would this gun be safe to use if a crazy person tried to use a hundred of them plus a variety of other guns, to make an elaborate Rube Goldberg machine and fire it off with live ammo with no testing?

Like, I hear you, but that is...also not how they teach gun safety.  Like, if there is one fact you know about gun safety, it's that the entire field emphasizes that a gun is inherently dangerous towards anything it is pointed towards.

I mean, that is kinda what I'm trying to get at. I feel like any sufficiently powerful AI should be treated as a dangerous tool, like a gun. It should be used carefully and deliberately.

Instead we're just letting anyone do whatever with them. For now, nothing too bad has happened, but I feel confident that the danger is real and getting worse quickly as models improve.

Richard Cook, “How Complex Systems Fail” (2000), on “Complex systems run as broken systems”:

The system continues to function because it contains so many redundancies and because people can make it function, despite the presence of many flaws. After accident reviews nearly always note that the system has a history of prior ‘proto-accidents’ that nearly generated catastrophe. Arguments that these degraded conditions should have been recognized before the overt accident are usually predicated on naïve notions of system performance. System operations are dynamic, with components (organizational, human, technical) failing and being replaced continuously.

h/t jasoncrawford

"And there’s a world not so far from this one where I, too, get behind a pause. For example, one actual major human tragedy caused by a generative AI model might suffice to push me over the edge." - Scott Aaronson in https://scottaaronson.blog/?p=7174 

My take: I think there's a big chunk of the world, a lot of smart powerful people, who are in this camp right now: people waiting to see a real-world catastrophe before they update their worldviews. In the meantime, they watch and wait, skeptical of implausible-sounding stories of potential risks.

[-] awg · 1y · 54

This stood out to me when reading his take as well. I wonder if this has something to do with a security-mindedness spectrum that people are on. Less security-minded people going "Sure, if it happens we'll do something. (But it will probably never happen.)" and the more security-minded people going "Let's try to prevent it from happening. (Because it totally could happen.)"

I guess it gets hard in cases like these, where the stakes either way seem super high to both sides. I think that's why you get less security-minded people saying things like that: because they also rate the upside very highly, they don't want to sacrifice any of it if they don't have to.

Just my take (as a probably overly-security-minded person).

A couple of quotes on my mind these days....

 


https://www.lesswrong.com/posts/Z263n4TXJimKn6A8Z/three-worlds-decide-5-8 
"My lord," the Ship's Confessor said, "suppose the laws of physics in our universe had been such that the ancient Greeks could invent the equivalent of nuclear weapons from materials just lying around.  Imagine the laws of physics had permitted a way to destroy whole countries with no more difficulty than mixing gunpowder.  History would have looked quite different, would it not?"

Akon nodded, puzzled.  "Well, yes," Akon said.  "It would have been shorter."

"Aren't we lucky that physics _didn't_ happen to turn out that way, my lord?  That in our own time, the laws of physics _don't_ permit cheap, irresistable superweapons?"

Akon furrowed his brow -

"But my lord," said the Ship's Confessor, "do we really know what we _think_ we know?  What _different_ evidence would we see, if things were otherwise?  After all - if _you_ happened to be a physicist, and _you_ happened to notice an easy way to wreak enormous destruction using off-the-shelf hardware - would _you_ run out and tell you?"

 

https://www.lesswrong.com/posts/sKRts4bY7Fo9fXnmQ/a-conversation-about-progress-and-safety 
LUCA: ... But if the wrong person gets their hands on it, or if it’s a super-decentralized technology where anybody can do anything and the offense/defense balance isn’t clear, then you can really screw things up. I think that’s why it becomes a harder issue. It becomes even harder when these technologies are super general purpose, which makes them really difficult to stop or not get distributed or embedded. If you think of all the potential upsides you could have from AI, but also all the potential downsides you could have if just one person uses it for a really bad thing—that seems really difficult. ...

Damn, this suggests that all those people who said "human mind is magical; a machine cannot think because it wouldn't have a soul or quantum magic" were actually trying to protect us from the AI apocalypse. We were too stupid to understand, and too arrogant to defer to the wisdom of the crowd. And now we are doomed.

galaxy brain take XD

Anti-steganography idea for language models:

I think that steganography is potentially a problem with language models that are in some sort of model-to-model communication. A simple and commonplace example: using a one-token-prediction model multiple times to produce many tokens in a row. If a model with strategic foresight knows it is being used in this way, it potentially allows the model to pass hidden information to its future self via use of certain tokens vs other tokens.

Another scenario might be chains of similar models working together in a bureaucracy.

My idea for countering this type of steganography would be to generate many versions of the output with synonym substitution, and/or by using language models fine-tuned to specifically reword paragraphs while preserving content/meaning. Similarly, if you are worried about images containing steganography but have an image generator model (separate from the original image maker, and which you believe is not in collusion with the original model), you can have the second model make subtle variants of the original image, which should be expected to lack the hidden info.
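A very rough sketch of the synonym-substitution variant, assuming NLTK's WordNet corpus is available (the function name and parameters here are my own illustration; a real defense would more likely use a paraphrasing model fine-tuned to preserve meaning):

```python
import random

from nltk.corpus import wordnet  # requires: import nltk; nltk.download("wordnet")


def synonym_variants(text: str, n_variants: int = 5, p_swap: float = 0.3) -> list[str]:
    """Generate paraphrased variants by randomly swapping words for WordNet synonyms.

    The hope is that any hidden channel relying on exact token choices is scrambled,
    while the surface meaning is (roughly) preserved.
    """
    words = text.split()
    variants = []
    for _ in range(n_variants):
        new_words = []
        for w in words:
            lemmas = {l.name().replace("_", " ") for s in wordnet.synsets(w) for l in s.lemmas()}
            lemmas.discard(w)
            if lemmas and random.random() < p_swap:
                new_words.append(random.choice(sorted(lemmas)))
            else:
                new_words.append(w)
        variants.append(" ".join(new_words))
    return variants


# Pass a randomly chosen variant (rather than the verbatim model output)
# to the next model in the chain, so steganographic token choices don't survive.
print(synonym_variants("the quick brown fox jumps over the lazy dog"))
```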

Musical interlude: a song about the end times: https://youtu.be/WVF3q5Y68-0

It is the possibility of recombination and cross-labeling techniques like this which makes me think we aren't likely to run into a data bottleneck, even if models stay bad at data efficiency.

OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation

Authors: Dongyang Yu, Shihao Wang, Yuan Fang, Wangpeng An

Abstract: This paper presents OmniDataComposer, an innovative approach for multimodal data fusion and unlimited data generation with an intent to refine and uncomplicate interplay among diverse data modalities. Coming to the core breakthrough, it introduces a cohesive data structure proficient in processing and merging multimodal data inputs, which include video, audio, and text. Our crafted algorithm leverages advancements across multiple operations such as video/image caption extraction, dense caption extraction, Automatic Speech Recognition (ASR), Optical Character Recognition (OCR), Recognize Anything Model (RAM), and object tracking. OmniDataComposer is capable of identifying over 6400 categories of objects, substantially broadening the spectrum of visual information. It amalgamates these diverse modalities, promoting reciprocal enhancement among modalities and facilitating cross-modal data correction. **The final output metamorphoses each video input into an elaborate sequential document**, virtually transmuting videos into thorough narratives, making them easier to be processed by large language models. Future prospects include optimizing datasets for each modality to encourage unlimited data generation. This robust base will offer priceless insights to models like ChatGPT, enabling them to create higher quality datasets for video captioning and easing question-answering tasks based on video content. OmniDataComposer inaugurates a new stage in multimodal learning, imparting enormous potential for augmenting AI's understanding and generation of complex, real-world data.

cute silly invention idea: a robotic Chop (Chinese signature stamp) which stamps your human-readable public signature as well as a QR-type digital code. But the code would be both single-use (so nobody could copy it and fool you with the copy) and tied to your private key (so nobody but you could generate such a code). This would obviously be a much better way to sign documents, or artwork, or whatever. Maybe the single-use aspect would mean that the digital stamp recorded every stamp it produced, in some compressed way, on a private blockchain or something.

The difficult thing is tying the signature to the thing signed. Even if they are single-use, unless the relying party sees everything you ever sign immediately, such a signature can be transferred to something you didn't sign from something you signed that the relying party didn't see.

What if the signature contains a hash of the document it was created for, so that it will not match a different document if transferred?

I thought you wanted to sign physical things with this? How will you hash them? Otherwise, how is this different from a standard digital signature?

The idea is that the device would have a camera, do OCR on the text, hash that, and incorporate the hash into the stamp design somehow; then you'd stamp it.

[-] [anonymous] · 2mo · 40

Functionally you just end up back at a trusted 3rd party or blockchain. Basically the device you describe is just a handheld QR code printer. But anyone can just scan the code off the real document and print the same code on the fake one. So you end up needing the entire text of the document to be digital, and stored as a file - or a hash of the file - on a blockchain or with a trusted 3rd party.

This requirement for a log of when something happened, recorded in an authoritative location, seems to me to be the general solution to the issue that generative AI can potentially fake anything.

So for example, if you wanted to establish that a person did a thing, a video of them doing the thing isn't enough. The video or hash of the video needs to be on a server or blockchain at the time the thing supposedly happened. And there needs to be further records such as street camera recordings that were also streamed to a trusted place that correlate to the person doing the thing.

The security mechanism here is that it may be possible to indistinguishably fake any video, but it is unlikely some random camera owned by an uninvolved third party would record a person going to do a thing at the time the thing happened, and you know the record probably isn't fake because of when it was saved and the disinterest of the camera owner in the case.

This of course only works until you can make robots indistinguishable from a person being faked. And it assumes a sophisticated group can't hack into cameras and inject fake images into the stream as part of a coordinated effort to frame someone....

Hmm, yes I see. Good point. In order to not be copy-able, it's not enough for the QR code to be single-use, because if confronted with two documents with the same code, you would know one was false but not which one! So the device would need to scan and hash the document in order to derive a code that is unique to the document, could only have been generated by someone with the private key, and is confirmable by the public key. This sounds like a job for... the CypherGoth!

[-] [anonymous] · 2mo · 42

Scan, hash, and upload. Or otherwise, when you go to court - "Your honor, here is the document I signed; it says nothing about this gym membership being eternal and hereditary" - and the gym says "No, here's the one you signed; see, you initialed each of the clauses..."

Yes, at least upload to your private list of 'documents I have signed' (which could be local). That list would need to also be verifiable in some way by a 3rd party, such that they could match the entries to the corresponding stamps. The uploading wouldn't necessarily need to be instant though; the record could potentially be cached on the device and uploaded later, in case of lack of connectivity at time of use.

[-] [anonymous] · 2mo · 20

Well, I was thinking that the timing - basically Google or Apple got the document within a short time after signing - is evidence it wasn't tampered with, or if it was, one of the parties was already intending fraud from the start.

Like, say party A never uploads, and during discovery in a lawsuit 2 years later they present 1 version, while party B's phone says its copy was taken with the camera (fakeable, but requires a rooted phone) and shows a different version. Nothing definitive distinguishes the 2 documents at a pixel level. (Because if there were a way to tell, the AI critic during the fabrication process for 1 or both documents would have noticed....)

So then B is more likely to be telling the truth and A should get sanctioned.

I have brought in another element here: trusted hardware chains. You need trusted hardware, with trusted software running on it, uploading the information at around the time an event happened.

If A's device scanned and hashed the document, and produced a signature stamp unique to that document, then B couldn't just copy A's signature onto a different document; the stamp and document wouldn't match. If B has a document bearing a signature which only A could have produced and which matches the semantic content of the document, then A can't claim not to have signed it. The signature is proof that A agreed to the terms as written; B couldn't be faking it. Even if A tampered with their own record to erase the signing-event, it would still be provable that only someone with A's private key could have produced the stamp and that the stamp matches the hash of the disputed document. Oh, and the stamp should include a UTC datetime as part of the hash, so that if A's private key later gets compromised, B can't use the compromised key to fake A having signed something in the past.
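A minimal sketch of that hash-and-sign step, using Ed25519 from the `cryptography` package (the `make_stamp`/`verify_stamp` names and the JSON payload layout are my own illustration of the scheme described above, not a finished protocol):

```python
import hashlib
import json
from datetime import datetime, timezone

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey, Ed25519PublicKey


def make_stamp(private_key: Ed25519PrivateKey, document_text: str) -> dict:
    """Bind a signature to this document's hash plus a UTC timestamp."""
    payload = {
        "doc_sha256": hashlib.sha256(document_text.encode("utf-8")).hexdigest(),
        "signed_at_utc": datetime.now(timezone.utc).isoformat(),
    }
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    payload["signature"] = private_key.sign(message).hex()
    return payload


def verify_stamp(public_key: Ed25519PublicKey, document_text: str, stamp: dict) -> bool:
    """Check that the signature is valid AND that it matches this exact document."""
    message = json.dumps(
        {"doc_sha256": stamp["doc_sha256"], "signed_at_utc": stamp["signed_at_utc"]},
        sort_keys=True,
    ).encode("utf-8")
    try:
        public_key.verify(bytes.fromhex(stamp["signature"]), message)
    except InvalidSignature:
        return False
    return stamp["doc_sha256"] == hashlib.sha256(document_text.encode("utf-8")).hexdigest()


# The stamp verifies against the text that was signed, but not against a tampered copy.
key = Ed25519PrivateKey.generate()
stamp = make_stamp(key, "Gym membership: 12 months, non-hereditary.")
print(verify_stamp(key.public_key(), "Gym membership: 12 months, non-hereditary.", stamp))  # True
print(verify_stamp(key.public_key(), "Gym membership: eternal and hereditary.", stamp))     # False
```

The robotic chop would encode something like this payload in the QR portion of the stamp; anyone with A's public key could then check both that A produced it and that it matches the exact document in front of them.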

[-] [anonymous] · 2mo · 20

Seems legit. Note that hash functions are usually sensitive to a single bit of error. Meaning if you optically scan every character and 2 different devices map a letter to even a different font size, there will be differences in the hash. (Unless you convert down to ASCII, but even that can miss a character from OCR errors etc., or interpret whitespace as spaces vs. tabs.)

Yeah, you'd need to also save a clear-text record in your personal 'signed documents' repository, which you could refer to, in case of such glitches.
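Alongside the clear-text record, one hedged way to make the hash less brittle to the OCR/whitespace issues mentioned above would be to normalize the text before hashing; a toy sketch:

```python
import hashlib
import re
import unicodedata


def canonical_hash(text: str) -> str:
    """Hash a normalized form of the text, so trivial OCR/whitespace/case differences don't change the digest."""
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Differences in whitespace and capitalization no longer change the document hash:
print(canonical_hash("Gym  Membership:\t12 months") == canonical_hash("gym membership: 12 months"))  # True
```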

AI Summer thought

A cool application of current level AI that I haven't seen implemented would be for a game which had the option to have your game avatar animated by AI. Have the AI be allowed to monitor your face through your webcam, and update the avatar's features in real time. PvP games (including board games like Go) are way more fun when you get to see your opponent's reactions to surprising moments.

The game Eco has the option to animate your avatar via webcam. Although I do own and play the game occasionally, I have no idea how good this feature is as I do not have a webcam.

Thinking about climate change solutions, and the neat Silver Lining project. I've been wondering about additional ways of getting sea water into the atmosphere over tropical ocean. What if you used the SpinLaunch system to hurl chunks of frozen seawater high into the atmosphere? Would the ice melt in time? Would you need a small explosive charge implanted in the ice block to vaporize it? How would such a system compare in terms of cost effectiveness and generated cloud cover? It seems like an easier way to get higher-elevation clouds.

This seems totally unworkable. The ice would have to withstand thousands of gs, and it would have no reason to melt or disperse into clouds. What's wrong with airplanes?

Airplanes are good. Just wondering if the ice launcher idea would be more cost effective. If you throw it hard enough, the air friction will melt it. If that's infeasible, then the explosive charge is still an option. As for whether the ice could hold up against the force involved in launching or if you'd need a disposable container, unclear. Seems like an interesting engineering question though.
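A back-of-envelope check on the "air friction will melt it" question, using rough numbers of my own (an exit speed around 2 km/s, which is in the ballpark SpinLaunch has discussed publicly):

```python
# All numbers here are approximate estimates, just to bound the question.
v = 2_000.0            # m/s, roughly the exit speed SpinLaunch has talked about
kinetic = 0.5 * v**2   # J per kg of ice ~ 2.0 MJ/kg of launch kinetic energy

c_ice = 2_100.0        # J/(kg*K), specific heat of ice
latent = 334_000.0     # J/kg, latent heat of fusion
melt_energy = c_ice * 20 + latent   # warm from -20 C to 0 C, then melt ~ 0.38 MJ/kg

print(melt_energy / kinetic)  # ~0.19: melting takes only ~20% of the kinetic energy budget
# Caveat: most drag heating goes into the surrounding air rather than the projectile,
# so whether the block actually melts and disperses at altitude is still an open question.
```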

Remember that cool project where Redwood made a simple web app to allow humans to challenge themselves against language models in predicting next tokens on web data? I'd love to see something similar done for the LLM arena, so we could compare the Elo scores of human users to the scores of LLMs: https://lmsys.org/blog/2023-05-03-arena/
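The Elo bookkeeping itself is trivial; a sketch of the standard update rule that could score humans against models in such an arena (K-factor of 32 is just a conventional choice):

```python
def elo_update(r_human: float, r_model: float, human_score: float, k: float = 32.0) -> tuple[float, float]:
    """Standard Elo update after one head-to-head round (human_score: 1 win, 0.5 draw, 0 loss)."""
    expected_human = 1.0 / (1.0 + 10 ** ((r_model - r_human) / 400.0))
    delta = k * (human_score - expected_human)
    return r_human + delta, r_model - delta


# e.g. a 1000-rated human beating a 1200-rated model once:
print(elo_update(1000.0, 1200.0, 1.0))  # human gains ~24 points, model loses ~24
```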

A harmless and funny example of generating an image output which could theoretically be info hazardous to a sufficiently naive audience. https://www.reddit.com/gallery/1275ndl

I'm so much better at coming up with ideas for experiments than I am at actually coding and running the experiments. If there was a coding model that actually sped up my ability to run experiments, I'd make much faster progress.

Just a random thought. I was wondering if it was possible to make a better laser keyboard by having an actual physical keyboard consisting of a mat with highly reflective background and embossed letters & key borders. This would give at least some tactile feedback of touching a key. Also it would give the laser and sensors a consistent environment on which to do their detection allowing for more precise engineering. You could use an infrared laser since you wouldn't need its projection to make the keyboard visible, and you could use multiple emitters and detectors around the sides of the keyboard. The big downside remaining would be that you'd have to get used to hovering your fingers over the keyboard and having even the slightest touch count as a press. 

The big downside remaining would be that you’d have to get used to hovering your fingers over the keyboard and having even the slightest touch count as a press.

That's a pretty big downside, and, in my opinion, the reason that touch keyboards haven't really taken off for any kind of "long-form" writing. Even for devices that are ostensibly mostly oriented around touch UIs, such as smartphones and tablets, there is a large ecosystem of physical keyboard accessories which allow the user to rest their hands on keys and provide greater feedback for key presses than a mat.

AI-alignment-assistant-model tasks

Thinking about the sort of tasks current models seem good at, translation and interpolation / remixing seem like pretty solid areas. If I were to design an AI assistant to help with alignment research, I think I'd focus on questions of these sorts to start with.

Translation: take this ML interpretability paper on CNNs and make it work for Transformers instead

Interpolation: take these two (or more) ML interpretability papers and give me a technique that does something like a cross between them.

Important take-away from Steven Byrnes's Brain-like AGI series: the human reward/value system involves a dynamic blend of many different reward signals that decrease in strength the closer they get to being satisficed, and may even temporarily reverse in value if overfilled (e.g. hunger -> overeating in a single sitting). There is an inherent robustness to optimizing for many different competing goals at once. It seems like a system design we should explore more in future research.
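A toy illustration of that kind of multi-drive reward (my own numbers and functional form, not from the series): each drive's contribution shrinks as its need is met and flips sign when overfilled.

```python
import numpy as np

# Each drive has a setpoint; its contribution shrinks as the need is met
# and flips sign if the variable overshoots (e.g. overeating in one sitting).
setpoints = np.array([1.0, 1.0, 1.0])   # e.g. satiety, warmth, social contact
levels    = np.array([0.2, 0.9, 1.3])   # current internal state
gains     = np.array([3.0, 1.0, 2.0])   # how strongly each drive is weighted

drive_rewards = gains * (setpoints - levels)  # positive when deficient, negative when overfilled
print(drive_rewards, drive_rewards.sum())     # [ 2.4  0.1 -0.6 ]  1.9
```

Optimizing a sum of many such competing signals at once is what gives the system its robustness: no single drive can be pushed arbitrarily far without the others pulling back.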

I keep thinking about the idea of 'virtual neurons': functional units corresponding to natural abstractions, made up of a subtle combination of weights & biases distributed throughout a neural network. I'd like to be able to 'sparsify' this set of virtual neurons: project them out to the full sparse space of virtual neurons, somehow tease them apart from each other, then recombine the pieces again with new boundary lines drawn around the true abstractions. Not sure how to do this, but I keep circling back around to the idea. Maybe if the network could first be distilled down to 16- or 8-bit numbers without much capability loss? Then a sparse space could be the full representation of the integer conversion of every point of precision (upcast the number, multiply by the factor of ten corresponding to the smallest point of precision of the original low-bit number, round off the remaining decimal portion, convert to a big integer). Then you could use that integer as an index into the sparse space with dimensionality equal to the full set of weights (factor the biases into the weights) of the model. Then look for concentrations in this huge space which correspond to abstractions in the world?   ... step 4. Profit?
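A very rough sketch of just the quantize-and-index step (with a random array standing in for real model weights, an 8-bit grid instead of powers of ten, and the question of how to find "concentrations" left open):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.05, size=10_000).astype(np.float16)  # stand-in for real model weights

# "Upcast, scale by the smallest point of precision, round to an integer":
scale = float(np.abs(weights).max()) / 127                        # step size of an 8-bit grid
levels = np.round(weights.astype(np.float32) / scale).astype(np.int32)

# Treat (parameter index, integer level) as a coordinate in a huge sparse space;
# the open question is how to cluster points here so they line up with abstractions.
sparse_coords = np.stack([np.arange(weights.size), levels], axis=1)
print(sparse_coords[:5])
```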