PRINCIPAL INVESTIGATOR: Matias del Campo
The purpose of this article is to discuss the application of artificial intelligence (AI) in the design of the Deep House project (Fig.1), an attempt to use estrangement as a method to emancipate the design of a house from a canonical approach to a progressive design of a one-family house project. The main argument in this text is that Artificial Neural Networks (ANNs), whether in the form of GANs, CNNs, or other networks, generate results that fall into the category of Estranged objects. In this article, I would like to offer a possible definition of what architecture in this plateau of thinking represents and how it differs from previous attempts to use estrangement as a design method, drawing on the ability of estrangement to explain the phenomena observed when working with NNs in architecture design. The article gathers a potpourri of thoughts that demonstrate the intellectual tradition of exploring estrangement, especially in theater and literature, and ultimately circles back to its implications for architecture, particularly in light of the application of AI.
They show a close relationship with techniques used in poetic imagery and aspects of analogies as a literary technique. Unfortunately, the problem is a little bit more complex due to the nature of curve fitting and the ability of NNs to induce aspects of defamiliarization into the results, but let us keep it that way for the time being. In addition, Margaret Boden described two more creativity concepts: exploratory and transformational6. Exploratory creativity involves the interrogation of structured conceptual spaces, which results in ideas that are not only novel but also surprising in their novelty. Exploring this ‘idea space’ resembles the nature of latent walks through the results of a data set interpolation. In a Generative Adversarial Network (GAN), the generative model learns to map points in the latent space to generated images. The latent space does not have any semantic information except the meaning applied through the generative model. Despite this, the structure of the latent space can be explored via interpolation between datapoints, and vector arithmetic can be performed within the latent space, thus targeting specific effects on the generated images7. Lastly, Boden talks about transformational creativity. This concept of creativity involves the transformation of one or more dimensions of the space so that new structures can emerge that could not have arisen before. The more fundamental the dimension concerned, and the more powerful the transformation, the more surprising the new ideas will be8. The last two concepts of creativity, Exploratory and Transformational, cross-pollinate each other. Consider, for example, how architects interrogate a design space. There are constraints for the design based on program, budget, building code, structural loads, environmental conditions, cultural preference, etc. However, architects are able to navigate and explore the design space to find surprisingly ingenious solutions to the spatial problem at hand.
Occasionally, an architect creates a transformational idea, think Alberti’s paradigm9 or Adolf Loos’s Raumplan10. (Fig.4)
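The latent walk and latent vector arithmetic mentioned above can be made concrete with a small sketch. It is an illustration only, under stated assumptions: the latent vectors are random stand-ins, the dimension of 8 is arbitrary, no generator network is involved, and the function names (`lerp`, `latent_walk`, `vector_arithmetic`) are hypothetical rather than taken from any particular GAN library.

```python
import random

def lerp(z_a, z_b, t):
    """Linear interpolation between two latent vectors at fraction t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

def latent_walk(z_a, z_b, steps):
    """Return `steps` evenly spaced latent vectors from z_a to z_b, inclusive."""
    if steps < 2:
        raise ValueError("a walk needs at least two steps")
    return [lerp(z_a, z_b, i / (steps - 1)) for i in range(steps)]

def vector_arithmetic(z_base, z_minus, z_plus):
    """Compute z_base - z_minus + z_plus, the usual latent 'analogy' move."""
    return [b - m + p for b, m, p in zip(z_base, z_minus, z_plus)]

# Assumption: random vectors stand in for the latent codes of two known images.
rng = random.Random(0)
dim = 8  # illustrative size; real latent spaces are usually much larger
z_a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(dim)]

# Five latent codes; a trained generator would turn each into one image.
walk = latent_walk(z_a, z_b, steps=5)
```

In an actual pipeline each vector in `walk` would be passed to a trained generator, and the resulting image sequence is what the text calls a latent walk. Practitioners often prefer spherical over linear interpolation for Gaussian latent spaces; the linear version is used here only because it is the simplest to read.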
To round up Boden’s interrogation of creativity, it might be necessary to mention her differentiation between H Creativity and P Creativity. The results of H Creativity (Historical creativity) provide ideas that are relevant and innovative on a historic scale and that no one had before, a testament to groundbreaking inventions and discoveries such as Newton’s laws of motion, Einstein’s relativity theory, or Le Corbusier’s Maison Dom-Ino. P Creativity (Psychological Creativity) refers to the individual mind and the ideas it generates, regardless of whether someone else already had the same idea11. This is a typical disposition in architecture, where the habit of maintaining the cliché of the lone genius results in the wheel getting reinvented repeatedly. On the other hand, Demis Hassabis, the CEO of DeepMind, postulates that creativity can be divided into three camps: interpolative, extrapolative, and inventive. In his lecture for the Royal Academy of Arts12, Hassabis pointed out that neither Artificial Neural Networks (ANNs, the basis of most AI applications today) nor Biological Neural Networks (human brains) are particularly good in the category of invention. (Our neuroscientist client confirmed that.)
However, humans still have an edge over machines in the category of extrapolation. Perhaps a quick run-down on the definition of these terms is in order. For the term extrapolation, we will rely on a definition lifted from statistics, which describes extrapolation as a process capable of constructing unknown data points outside a discrete set of known data points13. Interpolation, on the other hand, relies on a technique capable of constructing unknown points in the latent space between known points of the given dataset. Machines are generally much faster than humans in interpolation; however, this process can introduce results that are less meaningful and occasionally prone to greater uncertainty. In the case of neural architecture, this is not always a negative result, as aspects of uncertainty can produce surprising and ‘different’ results14. Do these processes indeed produce something ‘new’? I would argue that no, they don’t. Interpolations and extrapolations are intrinsically related to the material used to interpolate or extrapolate. The results do not appear out of thin air; there is always some sort of existing dataset, whether it is a collection of images or data for a machine or the memories in our minds for humans. As Parmenides of Elea put it: Nothing comes from nothing –“Ex nihilo, nihil fit”15 – or, to use a more modern expression: “There is no such thing as a free lunch.”16
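The statistical distinction between interpolation and extrapolation drawn above can be shown with a short worked example. This is a minimal sketch under stated assumptions: the known data points are samples of y = x², chosen only for illustration, and the helper `piecewise_linear` is a hypothetical name, not a function from any library. Inside the known range the estimate is an interpolation; outside it, the nearest segment is extended, which is an extrapolation.

```python
def piecewise_linear(xs, ys, x):
    """Estimate y at x from known points (xs must be sorted ascending).

    Inside [xs[0], xs[-1]] this interpolates along the enclosing segment;
    outside that range it extrapolates by extending the nearest end segment.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two known points")
    if x <= xs[0]:
        i = 0                       # extend the first segment to the left
    elif x >= xs[-1]:
        i = len(xs) - 2             # extend the last segment to the right
    else:
        # last segment whose left end is not greater than x
        i = max(k for k in range(len(xs) - 1) if xs[k] <= x)
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (x - xs[i])

# Assumption: the "dataset" is four samples of y = x**2.
known_x = [0.0, 1.0, 2.0, 3.0]
known_y = [0.0, 1.0, 4.0, 9.0]

inside = piecewise_linear(known_x, known_y, 1.5)   # interpolation: 2.5 (true value 2.25)
outside = piecewise_linear(known_x, known_y, 5.0)  # extrapolation: 19.0 (true value 25.0)
```

Both estimates are built entirely from the four known points, which is the sense in which neither process produces something from nothing. In this toy case the extrapolated value also drifts further from the underlying function than the interpolated one.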
After this short excursion into some of the current concepts on creativity and artificial intelligence, it is possible, with some certainty, to deduce that Boden’s and Hassabis’s definitions of creativity are fairly speculative and lack the neuroscientific foundation to understand how creative processes are performed in our brains. Instead, I would like to rely here on the work of Anna Abraham, a neuroscientist focused on creativity, who has postulated a series of exciting concepts that expand our understanding of creativity as a neural process17. In particular, she gives insight into the semantic and cognitive neural networks at play in creative processes, specifically in the generation of original ideas. This might bring an end to the debate on whether AI can be creative. As the research shows, it might be a moot point to discuss whether AI can be creative if we lack a solid understanding of what creativity is and how to encapsulate it in a mathematical algorithm that we can apply in an artificial neural network. The same holds for the problem of the ‘new’. The more interesting question for me is why the results created by artificial neural networks are so different. Strange? Provocative? This concerns not only their aesthetic properties; there are instances where results created by NNs profoundly question standard design conventions, whether in the form of urban designs or, as in the present case, the planning of a house.
An excursion into estrangement and defamiliarization
As demonstrated in the previous sections, it might be a moot exercise to debate the creativity of AI. Much more interesting is the ontological diagnosis of the results generated by NNs. Why they are so strange can be explained by the technical properties of curve fitting – however, in this section of my examination, I instead attempt to reveal the cultural signifiers present in the results. Here, the concept of estrangement comes into play. Introduced by Russian formalist Viktor Shklovsky in 1917 in his famous article Art as Technique18, the concept’s main characteristic consists of taking everyday objects or situations and inducing enough strange qualities into them to increase the attention of the observer. This is similar to the behavior of Attentional Neural Networks (AttNNs). AttNNs allow attention-driven multistage refinement for fine-grained text-to-image generation. Attention in neural networks mimics how humans can concentrate on particular aspects of their sensory input and blend out the rest around them19. The currently popular text-to-image algorithms, such as Disco Diffusion20 and Midjourney21, can be considered milestones in the development of successful text-to-image applications. In the right hands, the result is a piece of art that is familiar enough to be recognized as a person, a car, an animal, or a building but strange enough to evoke our attention.
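How attention lets a network concentrate on some inputs and blend out the rest can be illustrated with a generic scaled dot-product attention step. This is a sketch under stated assumptions: it is the textbook mechanism, not the specific architecture of the attentional networks or text-to-image tools named above, and the query, key, and value vectors are invented toy numbers.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights and the weighted sum of the values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context

# Assumption: three toy input regions, one of which aligns with the query.
query = [1.0, 0.0]
keys = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

weights, context = attend(query, keys, values)
```

The first input receives by far the largest weight because its key points in the same direction as the query, while the other two are largely blended out of `context`. That selective weighting is the computational counterpart of the focused attention described above.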
The method of estrangement was picked up by the genre-transforming German theater director Bertolt Brecht (Fig.5), who was directly inspired by Viktor Shklovsky. Brecht, however, described it as the ‘Entfremdungseffekt’ (the effect of making strange); it is also called a-Effekt22 or distanzierende Wirkung23, Deutsche Verfremdungseffekt24 or V-Effekt25, a central concept for the works of Brecht as demonstrated in plays such as Mutter Courage26, The Life of Galileo27 and The Good Person of Szechwan28. It includes applying techniques that allow the audience to distance themselves from participating on an emotional level with the play by reminding them of the artificiality of the theater performance. These methods include subtitles or illustrations projected on screens and actors breaking through the fourth wall to deliver exposition, summarize events, or sing a song. In addition, it includes abstract stage designs that do not depict particular locations but rather reveal the artificiality of the play by intentionally showcasing stage machinery such as lights and ropes. It is a method to control the audience’s identification with the figures on stage, thus enhancing the attention given to the reality reflected by the drama on stage. The Verfremdungseffekt was not intended as a purely aesthetic technique but as a political mission. To that extent, understanding Neural Architecture’s political imprint would be interesting. Does it encapsulate the political dimension in the respective datasets? If I build a dataset of Albert Speer’s architecture, does an NN produce Fascist architecture? This is highly questionable. If anything, I would instead take the position that labeling a dataset is a political act. Brecht was inspired by thinkers such as Hegel, Marx, and Shklovsky, and here in particular by Shklovsky’s concept of Ostranenie29.
The goal of Brecht was to allow the audience of his plays to understand complex historical and societal developments using the abstraction provided by the Verfremdungseffekt30. To this end, Brecht provided the audience with an active role in the play’s staging. The unusual and strange stage effects force the audience to consider aspects of the artificial and the part of each object in actual events. In doing so, the audience was allowed to maintain an emotional distance from the problems in order to interrogate them rationally and intellectually31. What does all that have to do with architecture, you ask? Well, according to Katja Hogenboom32, Estrangement and Defamiliarization allow architecture to regain its role in society by emancipating itself and engaging in a new social commitment. In challenging the existing cliches of architecture and actively liberating opportunities for architectural progress, Estrangement allows breaking through the conventions of the architectural status quo.
The main argument in this article is that Artificial Neural Networks (ANNs), whether in the form of GANs, CNNs, or other networks, generate results that fall into the category of Estranged objects. However, it is certainly not sufficient for a building just to have a strange appearance to be able to provoke the necessary break away from current architectural conventions. Take, for example, Frank Gehry’s Guggenheim Museum in Bilbao (Fig.6).
Regardless of its status as an icon of a strange new form, it does not question the status quo of a museum program; it remains a self-referential spectacle in steel, glass, and titanium. It wraps a complex form around a conventional museum program that celebrates consumerism and musealization. Of course, using methods such as estrangement, subversion, reflexivity, the absurd, and similar techniques as a means to activate the spectator (or the user of architecture, for that matter) and provoke emancipation is not entirely novel. As laid out in the previous section, Estrangement and Defamiliarization are actively present in works from G.W.F. Hegel, Karl Marx, Viktor Shklovsky, and Bertolt Brecht. Of course, a thorough interrogation of aspects of estrangement and defamiliarization cannot be complete without at least mentioning Sigmund Freud’s essay ‘Das Unheimliche’ (the Uncanny)33. Freud defines the uncanny as deeply rooted in what is known to individuals as common or familiar. Deviations from the familiar, the defamiliarizing aspects of life, result in emotional responses akin to fear and curiosity. In his essay, Freud demonstrates psychoanalytically why this is the case. This is more than just an exploration into a particular emotional response based on estranged stimuli; it is the basis for Freud’s theory on unconscious mental activity, which in turn forms a core part of psychoanalysis to this very day. In the context of artificial intelligence, discussing psychoanalysis is a massive undertaking that cannot be fairly elaborated on in a short essay such as this one; however, the idea alone that the uncanny forms a major part in understanding unconscious mental activity might be an interesting area of exploration regarding the use of neural networks in the creative industries, and thus in architecture. Perhaps as a means to better understand our clients? Or as a basis for the interrogation of consciousness in general?
Graham Harman is a recent author who explored uncanny aspects in his written work. His book, ‘Weird Realism: Lovecraft and Philosophy’34, makes a valiant effort to understand estrangement and defamiliarization as a valid category of aesthetic interrogation and uses H.P. Lovecraft’s writings as a guiding principle. (Whether this elevates H.P. Lovecraft from Pulp to High Literature is a different story altogether.) After this potpourri of thoughts that demonstrate the intellectual tradition of exploring estrangement as a valid methodology, especially in theater and literature, let us circle back to its implications for architecture, especially in light of the application of AI. Of course, there is also a tradition of applying estrangement techniques in architecture. From the strange figures and monsters in the Sacro Bosco of Bomarzo to Eisenman (who actually wrote about estrangement)35, to Roland Snooks’s strange tectonics36, there has always been a cohort in the architecture discipline interested in the innovative provocations made possible by a technique such as Estrangement. To understand the ambition of neural architecture and the theoretical framework established by discussing the ability of Estrangement to explain the phenomena observed when working with NNs in architecture design, I would like to offer a possible definition of what architecture in this plateau of thinking represents and how it differs from previous attempts to use Estrangement as a design method37. In discussing the affect of NNs on architecture, it very quickly becomes clear that architecture is not an inanimate object but rather constitutes an animate object in constant transformation while being populated and gazed upon. Architecture is not a pragmatic reflection of its function; instead, it can be considered activated matter driven by an agile approach to information, behavior, and perception over time.
The result is a material entity with aesthetic, organizational, programmatic, social, and cultural properties38. Estrangement in this frame of thinking constitutes more than an interesting novel aesthetic – it would not do it justice to describe it in these limited terms – it offers an opportunity to mobilize, provoke, and install emancipating alternatives, or as Katja Hogenboom put it, ‘situated freedoms’39 in complex conditions. The societal potential includes the interrogation of transformational micropolitics with the potential for a renewed concept of the private and the public space through methods of estrangement that result in an architecture that projects novel forms of emancipation. Explaining what these emancipations entail in their entirety would be a longer project, so I will leave it at that for now. The Deep House project, presented in this article, is an attempt to use estrangement as a method to emancipate the design of a house from a canonical approach to a progressive design of a one-family house project. Of course, estrangement can only be achieved when the result maintains enough familiar features to be recognized as a specific object, as in this case with a common modern house design. In the case of the Deep House, the owners desired to experiment with the feature-recognition abilities of neural networks. SPAN, the practice run by Sandra Manninger and myself, has done similar experiments before, such as the Robot Garden for the University of Michigan or the Generali Center design for the Mariahilferstraße in Vienna, Austria (Fig.7), which was based on a dataset of brutalist building facades. To be clear, this is not an attempt to imitate an existing architectural style. Heck, it might not be about style at all! It is more an attempt to inform an algorithm with the aid of a specific dataset about the organizational features of mass and volumes.
Once it learns these features, a deep learning approach is utilized in a generative role to produce models that respond to fitness criteria such as volumetric proportions, daylight diffusion, visual balance, and organizational properties.
As in the example of the Generali Center, the Deep House project uses a pixel projection technique to fold the two-dimensional information present in the images resulting from a latent walk into three-dimensional space. The caveat with this method is that information gets lost every time matter is folded into two-dimensional space and back to 3D. The multiple folding happening in this project means a large amount of information gets lost; at the same time, that lack of information might help in generating surprising results in the StyleGAN2 process. The pixel projection consists of three planes (x, y, z) used as the basis for three alpha channel images containing two facades and a plan resulting from the latent walks (Fig. 8). The datasets of these latent walks consist of a midcentury modern house facade dataset and the respective plans. Adding a subdivision and displacement modifier allows intersecting the projections of the alpha images in the center of the XYZ cube. An intersection boolean modifier limits the results to the boundaries of the alphas. By adding a remesh and Laplacian smoothing algorithm, it is possible to control whether the results are ‘soft’ or ‘hard.’ For the Deep House, we opted for a relatively coarse remeshing of the resulting voxels, which partially explains the strictly rectilinear nature of the design.
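The logic of intersecting three projected images inside a cube can be sketched in a few lines. This is a simplified stand-in, not the modeling workflow described above: it assumes each alpha image has been reduced to a binary mask, it replaces the displacement and boolean modifiers with a direct voxel test, and it omits remeshing and smoothing. The function name `intersect_projections` and the tiny masks are hypothetical.

```python
def intersect_projections(plan, front, side):
    """Intersect three extruded silhouettes into a voxel grid.

    plan[y][x], front[z][x] and side[z][y] are 0/1 masks (1 = opaque alpha).
    A voxel is kept only if all three projections cover it.
    Returns voxels[z][y][x].
    """
    nx, ny, nz = len(plan[0]), len(plan), len(front)
    if len(front[0]) != nx or len(side) != nz or len(side[0]) != ny:
        raise ValueError("mask dimensions do not describe one common cube")
    return [
        [
            [1 if plan[y][x] and front[z][x] and side[z][y] else 0
             for x in range(nx)]
            for y in range(ny)
        ]
        for z in range(nz)
    ]

# Assumption: toy masks on a 3 x 3 x 2 grid stand in for the generated images.
plan = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]          # plan[y][x]
front = [[1, 1, 1],
         [1, 0, 0]]         # front[z][x]: upper level only above x = 0
side = [[1, 1, 1],
        [1, 1, 0]]          # side[z][y]

voxels = intersect_projections(plan, front, side)
filled = sum(v for level in voxels for row in level for v in row)
```

Each mask retains only a silhouette, so any depth information that the generated image might have implied is discarded before the intersection. This is one way to see the information loss that the folding between 2D and 3D entails.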