Content Policies

In ChatGPT-4o, if you have the right subscription plan, the text-based AI acts as your agent for the renderer, DALL·E. So you spend your time negotiating with the text-based AI before it submits the rendering job. OpenAI has some very vague “content policies” that are meant to prevent the rendering of “harmful” or “dangerous” or “sensitive” images. What these words mean is not spelled out, so you only learn that you’ve run afoul of them when your rendering job is suddenly aborted before it finishes. You then ask the text-AI what was objectionable, and it tries to figure out, based on the partial image, what triggered the censors.

Some of these policies become clear right away. No nudity of any kind, and no hints of character nudity even if it cannot be seen. And usually no clothing, proximity to a bed, or character placement that hints at potential “intimacy.” The text-AI usually apologizes, pointing out that the policies were not meant to exclude legitimate cinematic treatments of natural interactions between fictional characters, but that they are overly coarse-grained and especially sensitive to photo-realistic scenes, out of fear that such scenes exploit real people. An irony here is that DALL·E is so good at photo-realism that the AI filter can’t tell that these are not real people. Another key category is the appearance of children. When the children are rendered photo-realistically, the filter becomes hyper-sensitive to any scene that suggests the exploitation of children, or the placement of children in harm’s way.

The text-AI then suggests ways to “soften” or obscure the prompt so that you get an acceptable image. Photo-realism is part of the problem. If you let the renderer produce what they call “painterly” images, you can render almost anything, because the image is clearly not real. This, of course, was not an option for me. Some scenes and scene sequences could not be sufficiently “softened” without destroying the narrative, so I had to give up on them. These are scenes you see every day in PG-13 movies.

The irony here is that all of my images that ran afoul of the censors were first generated by DALL·E itself, using rendering prompts that the text-AI itself composed. The image filter doesn’t (usually) kick in until there is an actual image in the session context. So ChatGPT-4o can successfully generate many images that it will not allow as input. DALL·E makes many small errors, so an image often wouldn’t be flagged until I fed it back into ChatGPT for correction. Oh, we can’t have that! So I managed to get some scenes generated (with errors) that weren’t allowed to be considered for error correction. The most absurd of these is the scene on the bridge with the children pointing down toward the river. DALL·E did its usual only-one-balustrade thing on the bridge. When I fed the image back in to get a second balustrade, the censors flagged it for putting children in danger: too close to the edge of a bridge without a proper railing.

In general, the policies seem to reflect a typical American orientation toward what is too sensitive to depict in polite society. Violence is usually fine, but hints of natural sexuality of any variety make the censors queasy.

Sometimes, though, an image will be abruptly censored in mid-rendering on the first attempt. You see incremental progress during the rendering process, so I saw some of what was being rendered before the axe fell. The most baffling of these is the one in the banner image above. In the novel text, Ilya and Katerinya engage in a long embrace at the railway station, amid a conductor and other disembarking passengers. The scene carries some emotional weight, so the text-AI read my novel text and outlined some pretty compelling directions for DALL·E on how to render the faces. The rendering was abruptly stopped mid-render by the censors. Note that these are two adults, fully clothed (very fully clothed – it’s winter!), engaging in an embrace. So the text-AI began a long series of re-prompts to try to “soften” the image to get it past the filter, though it admitted to being a bit baffled too. We removed the conductor, then all of the passengers (PDA?); then Katerinya was turned around so that her face was not visible; then Ilya’s eyes were closed. Finally it got through. I got to see a little bit of what was about to be rendered in the abruptly aborted attempts, and they all appeared to stop when the faces were being rendered. What I saw looked like the usual uncanny rendering of compelling emotional expressions that this AI is very good at.