
AI and Visual Culture

Panel chair: Lotte Philipsen, lottephilipsen@cc.au.dk

This panel explores interrelations between artificial intelligence and contemporary visual culture. Today, AI image analysis and image generation flourish in several different domains (e.g., news imagery, transport, artistic practice, private photography, the culture industry, policing). The panel investigates how AI (possibly) transforms our conception of vision, our use of images, and image technologies’ use of us. While the distributed ecologies of AI images cut across traditional borders such as visible/invisible and human/machine, the panel especially welcomes papers that analyze concrete AI image practice(s) and discuss their possible cultural implications.

Mette-Marie Zacher Sørensen: Deepfake Animations of Dead People

The Mexican journalist Javier Valdez was shot in 2017. Afterwards, a video was released of him saying: “I am not afraid because you cannot kill me twice”. In this paper I aim to analyse the ethics of deepfake animations. ‘Deepfakes’ is the popular term for faked videos (synthetic audiovisual media productions) produced using deep learning techniques. What interests me in particular in this context is the ethics of the implied subjects in an AI animation. Traditional image theory has seldom addressed the actual affect of depicted bodies (Berger, 1972), but along with the proliferation of selfies on social networks, where the sender is the subject (Tiidenberg and Gómez-Cruz, 2015), deepfakes bring to our attention, in a darker way, the way in which bodies in a moving image not only affect, but are also affected.

Biography

Mette-Marie Zacher Sørensen is associate professor in Aesthetics and Culture at Aarhus University, Denmark.

Amanda Wasielewski: Unnatural Images: On AI-Generated “Photographs”

In artificial intelligence (AI) and computer vision research, photographic images are typically referred to as “natural” images. This means that images used for automated categorization and recognition tasks are conceptualized within a binary as either natural or synthetic. Recent advances in creative AI technology, particularly generative adversarial networks (GANs), have made it possible to create photographic-seeming images, i.e., natural images, generated from patterns learned from vast databases of digital photographs. Contemporary discussions of these images, popularized in the media and on the website thispersondoesnotexist.com, have thus far revolved around the political and social implications of producing convincing “fake” photographs of people who do not exist. However, these images are of theoretical interest to the fields of art history and visual studies for additional reasons. The history and theory of photography have often centered on the relationship between photography and nature, its status within fine art, its indexical quality, its relationship to memory, and its documentary mode. GAN-created natural images both resonate with and oppose the formal readings of photography in these ways. This paper addresses these images from an art historical perspective and asks: can these images be considered photographs? If so, what are the implications for the field when photographic images are thus divorced from the mechanical process of lens, camera, and light hitting a reactive surface or sensor?

Biography 

Amanda Wasielewski is Docent in Art History at Stockholm University. She is currently part of the Metadata Culture project Sharing the Visual Heritage, focusing on the impact of digital tools in art historical scholarship and collections. Wasielewski is the author of three monographs: Made in Brooklyn: Artists, Hipsters, Makers, Gentrifiers (Zero, 2018), From City Space to Cyberspace: Art, Squatting, and Internet Culture in the Netherlands (Amsterdam University Press, 2021), and Computational Formalism: Art History and Machine Learning (MIT Press, forthcoming).

Asker Bryld Staunæs: Artificial Imagination in Grégory Chatonsky: AI image aesthetics, autonomies and possibilities

In this presentation, I will outline a strategy for a more consistent, imaginary and investigative art for AI. I will do this by interpreting Grégory Chatonsky’s project of “artificial imagination”, which he has developed since 2018 through AI images and short blog posts. I will characterise the AI art scene through two themes derived from Chatonsky: 1) that projects for machinic autonomy recuperate artistic autonomy, and 2) that the aesthetic question of AI imagery lies in-between autonomy and possibility.

As an artist-researcher, I find Chatonsky’s contribution important because he has formulated a rare aesthetic critique of the development of AI images through a methodology that he calls recherche-création. For Chatonsky, new models such as DALL-E 2 are devolving the field into banality, boredom and kitsch due to a computer-science aesthetic of ‘realism’ or ‘naturalism’. However, as this aesthetic is built on top of a condensed representation of recorded visual culture, there is a spectrum of possibility for re-assembling AI’s aesthetic and cultural values. Here, theory is swiftly converted into practice, as one can easily recognise how a specific aesthetic renders recorded visual culture, and as one is continuously in need of fine-tuning to ascertain whether images reveal inclinations of data (culture) or programming (aesthetics).

As AI image technologies are increasingly subject to a constant flux of methods, it becomes necessary to develop fundamental strategies. This is especially so when one compares Chatonsky to the contemporary movement of neuralism (e.g. Kogan’s Abraham, 2019, or Klingemann’s Botto, 2021), which seeks to “summon” an artificial autonomous artist (Rouviere 2017). Through Chatonsky’s critique, I will argue that these projects signify a generalized “artificial idiocy” (Bratton 2015) that can only “ejaculate onto the walls of the universe” in the manner of Alfred Jarry’s painting machine Clinamen from Dr. Faustroll (1911).

Biography

Asker Bryld Staunæs is an artist-researcher affiliated with MindFuture.ai and Spanien19c.

Daniel Chavez Heras: Computational Ekphrasis: On the Aesthetic Possibilities of Describing Images into Existence

In this paper I explore some of the most salient aesthetic possibilities enabled by recent large-scale computational models designed to link images and natural language, through the philosophical concept of ekphrasis.

Some of the most significant developments in applied machine learning research come from general large multimodal systems that can take text descriptions as inputs and produce matching images as outputs. Systems such as DALL·E 2 (Ramesh et al., 2022), Imagen (Saharia et al., 2022), and Flamingo (Alayrac et al., 2022) were all released within weeks of each other between April and May 2022. In the first part, I give a technical overview, for non-technical people, of the type of technology that underpins these systems, touching on the processes of tokenisation and vectorisation that allow multimodal learning, and the diffusion mechanism used to generate images from text prompts. I also give an account of how these models are currently being used by a growing community of practice in creative domains.
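The tokenisation-vectorisation-diffusion pipeline sketched above can be reduced to a toy illustration. The snippet below is emphatically not how DALL·E 2 or Imagen work internally: the hand-made vocabulary, the random token vectors standing in for a trained text encoder, and the loop that nudges a noise vector toward the prompt embedding standing in for a trained denoiser are all illustrative assumptions, chosen only to show the shape of the three stages.

```python
import numpy as np

# Toy vocabulary and tokeniser: words -> integer ids. Real systems use
# learned subword tokenisers; this hand-made mapping is illustrative only.
vocab = {"a": 0, "dog": 1, "on": 2, "the": 3, "beach": 4}

def tokenise(prompt):
    return [vocab[w] for w in prompt.lower().split()]

# Vectorisation: each token id indexes a row of vectors. Here the vectors
# are random; in a real model they come from a trained encoder (e.g. CLIP).
rng = np.random.default_rng(0)
embedding = rng.normal(size=(len(vocab), 8))

def embed(token_ids):
    # Mean-pool token vectors into one prompt vector (a stand-in for a
    # transformer encoder's output).
    return embedding[token_ids].mean(axis=0)

def toy_diffusion(cond, steps=50, seed=1):
    # Start from pure noise and repeatedly nudge the "image" vector toward
    # the text conditioning -- the basic shape of guided diffusion, with a
    # fixed linear update in place of a learned denoising network.
    g = np.random.default_rng(seed)
    x = g.normal(size=cond.shape)
    for _ in range(steps):
        x = x + 0.1 * (cond - x)
    return x

cond = embed(tokenise("a dog on the beach"))
img = toy_diffusion(cond)  # ends close to the prompt embedding
```

After 50 steps the residual noise shrinks by a factor of 0.9 per step, so the output sits very near the conditioning vector; the point is only that text becomes numbers, and numbers guide an iterative refinement from noise.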

In the second part I turn to ekphrasis, originally a poetic device and literary genre in classical rhetoric, but whose broader meaning in contemporary aesthetics and philosophy of art refers to a type of description that appears to evoke, summon, and sometimes even exceed that which is being described (for an introduction to this broader meaning in contemporary aesthetics see: Scott, 1991). One of the canonical examples in the literature is Homer’s rich description of Achilles’ shield in the Iliad: a vivid textual representation of a concrete and visible object, intended to make us “see” it in all its complexity with/through words (see: Becker, 1995; and for a reciprocal example see: Vail, 2018).

Following W. J. T. Mitchell’s (1994) three stages of ekphrasis (indifference, fascination, and fear), I explore how the contemporary viewing subject is produced through the web of interrelations afforded and constrained by computational ekphrasis; how, by describing images into existence, we can access visual culture even as we paradoxically dislodge images from vision.

Figure 1: Image generated by the author through a CLIP-guided diffusion model with the prompt “a group of computational humanities scholars in the future”

References 

Alayrac, J.-B. et al. (2022) Flamingo: a visual language model for few-shot learning. [online]. Available from: arxiv.org/abs/2204.14198 (Accessed 1 June 2022). 

Becker, A. S. (1995) The Shield of Achilles and the Poetics of Ekphrasis. Rowman & Littlefield Publishers.

Mitchell, W. J. T. (1994) ‘Ekphrasis and the Other’, in Picture Theory: Essays on Verbal and Visual Representation. Chicago: University of Chicago Press. pp. 151–181.

Ramesh, A. et al. (2022) Hierarchical Text-Conditional Image Generation with CLIP Latents. [online]. Available from: arxiv.org/abs/2204.06125 (Accessed 1 June 2022).

Saharia, C. et al. (2022) Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. [online]. Available from: arxiv.org/abs/2205.11487 (Accessed 1 June 2022).

Scott, G. F. (1991) The rhetoric of dilation: ekphrasis and ideology. Word & Image. 7 (4), 301–310.

Vail, K. (2018) Reconstructing the Shield of Achilles. Los Angeles: The Writer’s Lifeline.

Biography

Daniel Chavez Heras is a technologist and humanities scholar specialised in the computational production and analysis of visual culture. His research combines critical frameworks in the history and theories of cinema, television, and photography with advanced technical practice in creative and scientific computing. He has worked in industry and academia, is an experienced international university educator, and is based at King’s College London.
 

Jamie Wallace: The Visual Culture of Facial Expression Analysis

Despite the discredited status of physiognomy, facial expression analysis tracks the change and motion of facial features to assert the degree to which somebody is, for example, angry, disgusted, or surprised. The creation of algorithms capable of "seeing" these static and changing visual features, through measuring and processing complex sets of image data, relies not only upon computational operations but also upon a growing number of visual techniques and digital practices. The images resulting from such techniques constitute a particular form of visuality or visual culture, able to perform and support convincing acts of correlation and interpretation that enable emotional states to be encoded within graphical and diagrammatic relations. Understanding the limitations and biases of machine vision technologies depends, in part, upon appreciating the implications and cultural predispositions of the human-machine gaze as used and reconfigured in data visualisation techniques and the visual culture of technoscience more broadly. This proposal considers the manner in which the digital corporeal face of facial expression analysis appears to be both captured and masked by scientific relations that become entwined in a struggling modality, culturally conjoining human and posthuman identities.

Jamie Wallace 

Associate Professor, PhD, MFA

The Danish School of Education, DPU

Aarhus University, Denmark

jw@edu.au.dk

Perle Møhl: Seeing ensembles – human and AI vision in border control and radiology

In this paper, I present prior and current research on the interaction between human senses and sensor technologies, more specifically between human seeing and two types of visual analyses performed by machine-learning technologies, notably in automated border control and radiological cancer diagnostics. The presentation explores how humans – border police and radiologists – and their co-operative AIs learn to see together in order to detect anomalies and threats, whether to national borders or the scanned body. The focus is on the minute sensory interactions, i.e. how the AIs’ visual analyses are formed and trained by ideas about human vision and the visual, and how the human operators’ seeing is formed by the sensor technologies and their specific forms of vision and particular decisional capacities. These seeing ensembles do not, however, operate in a human-machine void. A variety of forces of a political, economic, organizational and material character also interact in different and often latent ways with the vision work taking place, and continually shape the decision-making.

Perle Møhl, anthropologist, PhD, is specialised in visual anthropology and has worked over the last 10 years with the interaction between human senses, particularly vision, and different visual and sensor technologies, notably in border control, security scans and robotic surgery. From Autumn 2022 she will participate in a collaborative research project about public values and AI.

Lotte Philipsen: Boundary (black) boxes: The aesthetics of ‘Google Arts & Culture’ AI image methods

This paper explores some of the aesthetic implications related to Google Arts & Culture’s AI image methods. The platform Google Arts & Culture (https://artsandculture.google.com/) provides its users with instant online access to images of, and information about, artworks and cultural artefacts from 1,000 cultural institutions across the globe. While the actual cultural artefacts in these institutions present a vast heterogeneity in terms of periods, media, material, geographical origin, cultural signification, etc., the curated search results provided by the platform consist of image representations that are automatically selected by a mix of black-boxed AI methods, apparently involving visual feature detection and object recognition, metadata analysis, and visual text recognition.

The paper investigates some of the aesthetic implications of the seamless mix and opaqueness of these AI methods. Not only is the user’s sensuous experience constantly shuffled between visual retinal and textual cognitive input, but the platform also, to a large extent, dictates the user’s implied subject positions (platform explorer, museum visitor, knowledge seeker, artistic appreciator, etc.) and swings the user between narrow closures determined by the platform’s categorical boundaries and uncontrollable openings in the form of overwhelming piles of heterogeneous images.

Bio: Lotte Philipsen is associate professor in Art History at Aarhus University, Denmark.