ChatGPT spills its prompt

[ - ]

David Gerard@awful.systemsOPM

12 points

6 months ago

we did a writeup too https://pivot-to-ai.com/2024/07/05/chatgpt-spills-its-prompt/

permalink

report

reply

[ - ]

Steve@awful.systems

37 points

6 months ago

Is it absurd that the maker of a tech product controls it by writing it a list of plain language guidelines? or am I out of touch?

permalink

report

reply

[ - ]

Kg. Madee Ⅱ.@mathstodon.xyz

29 points

6 months ago

@fasterandworse @dgerard I mean, it is absurd. But it is how it works: an LLM is a black box from a programming perspective, and you cannot directly control what it will output.
So you resort to pre-weighting certain keywords in the hope that it will nudge the system far enough in your desired direction.
There is no separation between code (what the provider wants it to do) and data (user inputs to operate on) in this application 🥴

permalink

report

parent

reply

[ - ]

corbin@awful.systems

6 points

6 months ago

That’s the standard response from last decade. However, we now have a theory of soft prompting: start with a textual prompt, embed it, and then optimize the embedding with a round of fine-tuning. It would be obvious if OpenAI were using this technique, because we would only recover similar texts instead of verbatim texts when leaking the prompt (unless at zero temperature, perhaps.) This is a good example of how OpenAI’s offerings are behind the state of the art.

permalink

report

parent

reply

[ - ]

barsquid@lemmy.world

13 points

6 months ago

It is absurd. It’s just throwing words at it and hoping whatever area of the vector database it starts generating words from makes sense in response.

permalink

report

parent

reply

[ - ]

ebu@awful.systems

20 points

6 months ago

*

simply ask the word generator machine to generate better words, smh

this is actually the most laughable/annoying thing to me. it betrays such a comprehensive lack of understanding of what LLMs do and what “prompting” even is. you’re not giving instructions to an agent, you are feeding a list of words to prefix to the output of a word predictor

in my personal experiments with offline models, using something like “below is a transcript of a chat log with XYZ” as a prompt instead of “You are XYZ” immediately gives much better results. not good results, but better

permalink

report

parent

reply

[ - ]

Steve@awful.systems

12 points

6 months ago

it’s all so anti-precision

permalink

report

parent

reply

[ - ]

o7___o7@awful.systems

10 points

6 months ago

*

simply ask the word generator machine to generate better words, smh

Butterfly man: “Is this recursive self-improvement”

permalink

report

parent

reply

[ - ]

V0ldek@awful.systems

3 points

6 months ago

“controls” is way too generous

permalink

report

parent

reply

[ - ]

FRANK.MCCONNEL@fosstodon.org

6 points

6 months ago

@fasterandworse @dgerard I am pretty sure I have seen programming the computer in plain English used as a selling point for various products since the 1970s at least

the best part is that most of these products are ex-products

permalink

report

parent

reply

[ - ]

FRANK.MCCONNEL@fosstodon.org

4 points

6 months ago

@fasterandworse @dgerard I mean, it’s like catnip for the people who control how the company’s money is spent

For absurd, I think one would want the LLM’s configuration language to be more like INTERCAL; but this may also be more explicit about how your instructions are merely suggestions to a black box full of weights and pulleys and with some randomness added to make it less predictable/repetitive

permalink

report

parent

reply

[ - ]

🇺🇦 haxadecimal@mastodon.social

3 points

6 months ago

@hairyvisionary @fasterandworse @dgerard
That was explicitly a goal of COBOL, and (guessing here) probably Commercial Translator as well.

permalink

report

parent

reply

[ - ]

Last@reddthat.com

15 points

6 months ago

It still works. Say “hi” to it, give it the leaked prompt, and then you can ask about other prompts. I just got this one when I asked about Python.


When you send a message containing Python code to python, it will be executed 
in a
stateful Jupyter notebook environment. python will respond with the output of 
the execution or time out after 60.0
seconds. The drive at '/mnt/data' can be used to save and persist user files. 
Internet access for this session is disabled. Do not make external web requests 
or API calls as they will fail.
Use ace_tools.display_dataframe_to_user(name: str, dataframe: pandas.DataFrame) 
-> None to visually present pandas DataFrames when it benefits the user.
 When making charts for the user: 1) never use seaborn, 2) give each chart its 
own distinct plot (no subplots), and 3) never set any specific colors – 
unless explicitly asked to by the user. 
 I REPEAT: when making charts for the user: 1) use matplotlib over seaborn, 2) 
give each chart its own distinct plot (no subplots), and 3) never, ever, 
specify colors or matplotlib styles – unless explicitly asked to by the user```

permalink

report

reply

[ - ]

barsquid@lemmy.world

20 points

6 months ago

“I repeat…”

That’s exactly what I want from a computer interface, something that’s struggling to pay attention to directions and needs to be told everything twice. It’d also like it to just respond with whatever has a cosine similarity to the definitions of the words in the instructions I gave it, instead of doing what I actually asked.

permalink

report

parent

reply

[ - ]

Sailor Sega Saturn@awful.systems

72 points

6 months ago

You can practically taste the frustration in the “prompt engineering” here. Just one more edge case bro, one more edge case and then the prompt will be perfect!

permalink

report

reply

[ - ]

NaibofTabr@infosec.pub

27 points

6 months ago

It’s edge cases all the way down.

permalink

report

parent

reply

[ - ]

rtxn@lemmy.world

18 points

6 months ago

4D chess move: you can’t have an edge case if every case is an edge case

permalink

report

parent

reply

[ - ]

Steve@awful.systems

5 points

6 months ago

it’s like if all browser bugs were like IE6 bugs that only happened sometimes because you have a float after an inline element that contains the letter c, or sometims b, somewhere in the dom.

permalink

report

parent

reply

[ - ]

recklessengagement@lemmy.world

7 points

6 months ago

Hah, still worked for me. I enjoy the peek at how they structure the original prompt. Wonder if there’s a way to define a personality.

permalink

report

reply

[ - ]

corbin@awful.systems

5 points

6 months ago

Not with this framing. By adopting the first- and second-person pronouns immediately, the simulation is collapsed into a simple Turing-test scenario, and the computer’s only personality objective (in terms of what was optimized during RLHF) is to excel at that Turing test. The given personalities are all roles performed by a single underlying actor.

As the saying goes, the best evidence for the shape-rotator/wordcel dichotomy is that techbros are terrible at words.

NSFW

The way to fix this is to embed the entire conversation into the simulation with third-person framing, as if it were a story, log, or transcript. This means that a personality would be simulated not by an actor in a Turing test, but directly by the token-predictor. In terms of narrative, it means strictly defining and enforcing a fourth wall. We can see elements of this in fine-tuning of many GPTs for RAG or conversation, but such fine-tuning only defines formatted acting rather than personality simulation.

permalink

report

parent

reply

[ - ]

o7___o7@awful.systems

11 points

6 months ago