A survey of comics research in computer science

Graphic novels such as comics and manga are well known all over the world. The digital transition has started to change the way people read comics: more and more on smartphones and tablets, and less and less on paper. In recent years, a wide variety of research about comics has been proposed and might change the way comics are created, distributed and read in future years. Early work focuses on low-level document image analysis: indeed, comic books are complex, as they contain text, drawings, balloons, panels, onomatopoeia, etc. Different fields of computer science, such as multimedia, artificial intelligence, and human-computer interaction, have covered research about user interaction and content generation, with different sets of values. In this paper, we review the previous research about comics in computer science, state what has been done, and give some insights about the main outlooks.


Introduction
Research on comics has been done independently in several research fields such as document image analysis, multimedia, human-computer interaction, etc. with different sets of values. We propose to review the research of all of these fields and to organize it in order to understand what can be done with comics using state-of-the-art methods. We also give some ideas about the future possibilities of comics research.
We introduced a brief overview of comics research in computer science [5] during the second edition of the international workshop on coMics ANalysis, Processing and Understanding (MANPU). The first edition of MANPU workshop took place during ICPR 2016 (International Conference on Pattern Recognition) and the second one took place during ICDAR 2017 (International Conference on Document Analysis and Recognition). It shows that comics can interest a large variety of researchers from pattern recognition to document analysis. We think that the multimedia and interface communities could have some interest too, so we propose to present the research about comics analysis with a broader view.
In the next part of the introduction, we explain the importance of comics and their impact on society, with a brief overview of the open problems.

Comics and society
Comics in the USA, manga in Japan and bandes dessinées in France and Belgium are graphic novels which have a worldwide audience. They are respectively an important part of the American, Japanese and Francophone cultures. They are often considered as a soft power of these countries, especially manga for Japan [39,27]. In France, bande dessinée is considered an art, and is commonly referred to as the "ninth art" [67] (as compared to cinema, which is the seventh art).
However, this was not the case several years ago. Comics was considered "children's literature" or "sub-literature" as it contains a mixture of images and text. But more lately comics got a great deal of interest when people recognized it as a complex form of graphic expression that can convey deep ideas and profound aesthetics [12].
Fig. 1 We arranged the comics research into three interdependent categories: 1) content analysis, 2) content generation, and 3) user interaction.
The market of comics is large. According to a report published in February 2017 by "The All Japan Magazine and Book Publisher's and Editor's Association" (AJPEA), sales of manga in Japan amounted to 445.4 billion yen (around 4 billion dollars) in 2016 (footnote 1). In this report, we can see that the market was stable between 2014 and 2015, but a large progression of the digital market can be observed: it almost doubled from 2014 to 2016. The digital format has several advantages for the readers: it can be displayed on smartphones or tablets and be read anytime, anywhere. For the editors, the cost of publication and distribution is much lower compared to the printed version.
However, even if the format changed from paper to screen, no added value has been proposed to the customer. We think that the democratization of digital format is a good opportunity for the researchers from all computer science fields to propose new services such as augmented comics, recommendation systems, etc.

Research and open problems
The research about comics is quite challenging because of the nature of this medium. Comics contains a mixture of drawings and text. To fully analyze and understand the content of comics, we need to consider natural language processing to understand the story and the dialogues, and computer vision to understand the line drawings, characters, locations, actions, etc. A high-level analysis is also necessary to understand events, emotions, storytelling, the relations between the characters, etc. A lot of related research covering similar aspects has been done for natural images (i.e. photographic imagery) and videos by classic computer vision. However, the high variety of drawings and the low availability of labeled datasets make the task harder than for natural images.
We organized the research about comics in the three following main categories, as illustrated in Fig. 1:
1. content analysis: getting information about raw images and extracting from high to low-level structured descriptions;
2. content generation: comics can be used as an input or output to generate new contents. Content conversion and augmentation are possible from comics to comics, comics to other media, other media to comics;
3. user interaction: analyzing human reading behavior and internal states (emotions, interests) based on comics contents, and reciprocally, analyzing comics contents based on human behavior and interactions.
Research about comics in computer science has been done covering several aspects but is still an emerging field. Much research has been done by researchers from the DIA (Document Image Analysis) and AI (Artificial Intelligence) communities and focuses on content analysis, understanding, and segmentation. Another part of the research is addressed by graphics and multimedia communities and consists in generating new contents or enriching existing contents such as adding colors to black and white pages, creating animation, etc. The last aspect concerns the interaction between users and comics which is mainly addressed by the HCI (Human-Computer Interaction) researchers. All these three parts are inter-dependent: segmenting an area of a comic page is important if we want to manipulate and modify it, or if we want to know which area the user is interacting with. Analyzing the user behavior can be used to drive the content changes or to measure the impact of these changes on the user.
In Section 3, we will state in more detail the current state of the art and discuss the open problems. Large datasets with ground truth information such as layout, characters, speech balloons, text, etc. are not available, so using deep learning is hardly possible in such conditions, and most researchers proposed handcrafted features or knowledge-driven approaches until very recent years. The availability of tools and datasets that can be accessed and shared by the research communities is another very important aspect to transform the research about comics. We will talk about the major existing tools and datasets in Section 4.
In the next parts of the paper, all the research applied to comics, manga, bandes dessinées or any graphic novels will be referred to as research on "comics" in order to simplify the reading. We start the next section of the paper with general information about comics.
2 What is comics?
The term comics (as a singular uncountable noun) refers to the comics medium: like television or radio, comics is a way to convey information. We can also refer to a comic (as a countable noun); in this case, we refer to an instance of the medium such as a comic book or a comic page.
As with any art, there are no strict rules for creating comics. The authors are free to draw whatever and however they want. Still, some classic layouts or patterns are commonly used by authors who want to tell a story, transmit feelings and emotions, and guide the attention of the readers [31]. The author needs experience and knowledge to smoothly guide the attention of the readers through the comics [8]. Furthermore, the layout of comics evolves over time [52], moving away from conventional grids toward more decorative and dynamic layouts.
Usually, comics are printed in books and can be seen as single or double pages. When the book is opened, the reader can see both pages, so some authors use this physical layout as part of the story: some drawings can be spread over two pages, and when the reader turns one page something might happen on the next page. Figure 2 illustrates a classic comics content.
A page is usually composed of a set of panels defining a specific action or situation. The panels can be enclosed in a frame and separated by a white space area named gutter. The reading order of the panels depends on the language. For example, in Japanese (see Fig. 4), the reading order is usually from right to left and top to bottom. Speech balloons and captions are included in the panel to describe conversations or the narration of the story. The dialog balloons also have a specified reading order which is usually the same as the reading order of the panels. Some sound effects or onomatopoeias are often included to give more sensations to the reader such as smell or sound. Japanese comics often contain "manpu" (see Fig. 3), which are symbols used to visualize feelings and sensations of the characters, such as sweating marks on the head of a character to show that he feels uncomfortable even if he is not actually sweating.
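The reading order described above can be sketched in a few lines of code. The example below is only an illustration under strong assumptions: panels are axis-aligned rectangles arranged in clean rows, and the function name `reading_order`, the box format and the `row_tolerance` parameter are hypothetical choices, not taken from any cited work. Real pages with overlapping or irregular panels would need a more robust method.

```python
from typing import List, Tuple

# Hypothetical box format: (x, y, width, height), origin at the top-left corner.
Box = Tuple[int, int, int, int]


def reading_order(panels: List[Box], right_to_left: bool = True,
                  row_tolerance: int = 20) -> List[Box]:
    """Order panels top to bottom, then right to left (or left to right).

    Panels whose top edge lies within `row_tolerance` pixels of the first
    panel of the current row are treated as belonging to that row.
    """
    rows: List[List[Box]] = []
    for box in sorted(panels, key=lambda b: b[1]):  # sort by top edge
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])

    ordered: List[Box] = []
    for row in rows:
        # Within a row, sort by left edge; reverse for right-to-left reading.
        ordered.extend(sorted(row, key=lambda b: b[0], reverse=right_to_left))
    return ordered


page = [(0, 0, 100, 100), (110, 5, 100, 100), (0, 120, 210, 100)]
print(reading_order(page, right_to_left=True))
```

Setting `right_to_left=False` gives the left-to-right order used, for example, in American comics.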
The authors are free to draw the characters as they want, so they can be deformed or disproportioned as illustrated in Fig. 7. In some genres such as fantasy, the characters can also be non-human, which makes the segmentation and recognition task challenging. There are also many drawing effects such as speed lines, focusing lines, etc. For example, in Fig. 2, a texture surrounding the female character in the lower-right panel represents her warm atmosphere as contrasted with the cold weather.
Even though more and more digitized versions of printed comics are available, few comics are produced digitally and take advantage of the new technology. Figure 5 illustrates an example of digital comics taking advantage of tablet functions: the images are animated continuously and the user can tilt the tablet to control the camera angle. This comic was created by Andre Bergs (footnote 2) and is freely available on the App Store and Google Play. We imagine that in the future, it could be possible to create such interactive comics automatically.

Comics research
We organized the studies done about comics in computer science into three main categories that we will present in this section. One of the main research fields focuses on analyzing the content of comics images, extracting the text, the characters, segmenting the panels, etc. Another category is about generating new content from or for comics. The last category is about analyzing the reader's behavior and interaction with comics.

Content analysis
In order to understand the content of comics and to provide services such as retrieval or recommender systems, it is necessary to extract the content of comics. The DIA community started to cover this problem with classic approaches. Images can be analyzed from the low level, such as screentones [29] or text [3], to the high level, such as style [13] or genre [19] recognition. Some elements are interdependent, for example the text and the speech balloons, as one can contain the other. The positions can also be relative to each other, as the speech balloon usually comes from the mouth of a character. These elements are usually grouped inside a panel, but not necessarily. As the authors are free to draw whatever and however they want, there is a wide disparity among comics, which makes the analysis complex. For example, some authors exaggerate the deformation of the face of a character to make him look angrier or more surprised.
We present the related work from low-level to high-level analysis as follows.

Textures, screentones, and structural lines
Black and white textures are often used to enrich the visual experience of non-colored comics. They are especially used for creating an illusion of shades or colors. However, the identification and segmentation of the textures is challenging as they can have various forms and are sometimes mixed with the other parts of the drawing. Ito et al. proposed a method for separating the screentones and line drawings [29]. More recently, Liu et al. [43] proposed a method for segmenting the textures in comics.
Extracting the structural lines of comics is another challenging problem which is related to the analysis of the texture. The result of such an analysis is displayed in Fig. 6. The difference between structural lines and arbitrary ones must be considered carefully. Li et al. [41] recently proposed a deep network model to handle this problem. Finding textures and structural lines is an important analysis step to generate colorized and vectorized comics.
Fig. 6 Structural line extraction. For each pair of images, the one on the left is the original image, and the one on the right is obtained after removing the textures and detecting the structural lines with the algorithm of Li et al. [41]. Downloaded from: http://exhibition.cintec.cuhk.edu.hk/exhibition/project-item/manga-line-extraction/.

Text
The extraction of text characters (such as Latin or Chinese) has been investigated by several researchers but is still a difficult problem as many authors write the text by hand.
Arai and Tolle [3] proposed a method to extract frames, balloons, and text based on connected components and fixed thresholds on their sizes. This is a simple approach which works well for "flat" comics, i.e. conventional comics where each panel is defined by a black rectangle and has no overlapping parts.
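To make the idea of connected components filtered by fixed size thresholds more concrete, here is a minimal sketch. It is not the implementation of Arai and Tolle [3]: the binary-grid input, the 4-connectivity, the function names and the threshold values are all assumptions chosen for illustration.

```python
from collections import deque
from typing import List, Tuple

# Hypothetical box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def connected_components(binary: List[List[int]]) -> List[Box]:
    """Return the bounding box of every 4-connected region of 1-pixels."""
    if not binary:
        return []
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            min_x = max_x = x
            min_y = max_y = y
            while queue:
                cy, cx = queue.popleft()
                min_x, max_x = min(min_x, cx), max(max_x, cx)
                min_y, max_y = min(min_y, cy), max(max_y, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def filter_by_size(boxes: List[Box], min_side: int, max_side: int) -> List[Box]:
    """Keep boxes whose width and height both lie within fixed thresholds."""
    return [b for b in boxes
            if min_side <= b[2] <= max_side and min_side <= b[3] <= max_side]


grid = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
candidates = filter_by_size(connected_components(grid), min_side=2, max_side=4)
print(candidates)
```

Fixed thresholds of this kind are exactly what makes such approaches brittle: they need to be tuned to the page resolution and to the drawing style.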
Rigaud et al. also proposed a method to recognize the panels and text based on the connected components [63]. By adding some other features such as the topological and spatial relations, they successfully increased the performance of [3].
More recently, Aramaki et al. combined connected component and region-based classifications to make a better text detection system [4]. A recent method also addresses the problem of speech text recognition [58].
In order to simplify the problem, Hiroe and Hotta have proposed to detect and count the number of exclamation marks in order to represent a comic book by its distribution of exclamation marks or to find the scene changes [28].

Faces and pose
One of the most important elements of comics is the characters (persons) of the story. However, identifying the characters is challenging because of the posture, occlusions, and other drawing effects. Also, the characters can be humans, animals, robots or anything with various drawing representations. Sun et al. [70] proposed to locate and identify the characters in comics pages by using local feature matching. New methods have recently been proposed to recognize the face and characters in comics based on deep neural networks [14,53,50].
Estimating the pose of the character is another challenge. As we can see in Fig. 8, if the characters have human proportions and are not too deformed, they can be well recognized by a popular approach such as OpenPose [10]. Knowing the character poses could lead to activity recognition, but a method such as OpenPose will fail on almost all comics.

Balloons
The balloons are an important component of comics where most of the information is conveyed by the discussion between the protagonists. So one important step is to detect the balloons [18] and then to associate the balloons to the speaker [61].
The shape of the balloon also conveys information about the speaker's feelings [77]. For example, a balloon with a wavy shape represents anxiety, an explosion shape represents anger, a cloudy shape represents joy, etc.

Panel
The layout of a comics page is described by Tanaka et al. as a sequence of frames named panels [71]. Several methods have been proposed to segment the panels, mainly based on the analysis of connected components [2], [63] or on the page background mask [51].
As these heuristic methods rely on white backgrounds and clean gutters, Iyyer et al. recently proposed an approach based on deep learning [30] to process eighty-year-old American comics.
Fig. 9 Example application of illustration2vec [64]. The model recognizes several attributes of the character such as her haircut and clothes. The web demo used to generate this image is no longer online.

High level understanding
Rigaud et al. proposed a knowledge-driven system that understands the content of comics by segmenting all the sub-parts [59]. But understanding the narrative structure of comics is much more than simply segmenting its different sub-parts. Indeed, the reader makes inferences about what is happening from one frame to another by looking at all graphical and textual elements [46].
Iyyer et al. introduced some methods to explore how readers connect panels into a coherent story [30]. They show that both text and images are important to guess what is happening in a panel by knowing the previous ones.
Daiku et al. [19] proposed to analyze the comics storytelling by analyzing the genre of each page of the comics. Then the story of a comic book is represented as a sequence of genres such as: "11 pages of action", "5 pages of romance", "8 pages of comedy", etc.
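The representation of a story as a sequence of genres can be illustrated with a simple run-length encoding of per-page genre labels. This is only a sketch of the representation, not of the classification method of Daiku et al. [19]: the per-page labels are assumed to be given, and the function names are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def genre_sequence(page_genres: List[str]) -> List[Tuple[str, int]]:
    """Collapse consecutive pages with the same genre into (genre, count) runs."""
    return [(genre, len(list(run))) for genre, run in groupby(page_genres)]


def describe(sequence: List[Tuple[str, int]]) -> List[str]:
    """Render each run as a short human-readable string."""
    return [f"{count} page{'s' if count != 1 else ''} of {genre}"
            for genre, count in sequence]


# Hypothetical per-page labels, as a genre classifier might produce them.
pages = ["action"] * 11 + ["romance"] * 5 + ["comedy"] * 8
print(describe(genre_sequence(pages)))
```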
Analyzing the text of the dialogues and stories has not yet been investigated specifically for comics. Related research such as sentiment analysis [48] could be applied to analyze the psychology of the characters or to analyze and compare the narrative structure of different comics.
From the cognitive point of view, Cohn proposed a theory of "Narrative Grammar" based on linguistics and visual language which are leading the understanding process [16]. A lot of information is inferred by the reader who is constructing a representation of the depicted pictures in his mind. This is how we can recognize that two characters drawn slightly in a different way are the same, or that a character is doing an action by looking at a still image. These concepts must be inferred by the computer too, in order to obtain a high-level representation of comics.

Applications
From these analyses, retrieval systems can be built, and some have already been proposed in the literature, such as sketch-based [45,49] or graph-based [40] retrieval. The drawing style has also been studied [13]. The possible applications are artist retrieval, art movement retrieval, and artwork period analysis.
Saito and Matsui proposed a model for building a feature vector for illustrations named illustration2vec [64]. As shown in Fig. 9, this model can be used to predict the attributes of a character such as its hair or eye color, the length of the hair, the clothes worn by the character, etc. and to search for specific illustrations. Vie et al. proposed a recommender system using the illustrations of comics covers based on illustration2vec in a cold-start scenario [72].
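Once each illustration is represented by a feature vector, searching for similar illustrations can be reduced to a nearest-neighbor ranking. The sketch below assumes that such vectors are already available; it does not call illustration2vec, and the tiny three-dimensional vectors, the cover names and the choice of cosine similarity are made up for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_illustrations(query: Sequence[float],
                       library: Dict[str, Sequence[float]]) -> List[str]:
    """Return illustration names sorted from most to least similar to the query."""
    return sorted(library,
                  key=lambda name: cosine_similarity(query, library[name]),
                  reverse=True)


# Hypothetical feature vectors; real ones would have many more dimensions.
library = {
    "cover_a": [0.9, 0.1, 0.0],
    "cover_b": [0.0, 1.0, 0.0],
    "cover_c": [0.5, 0.5, 0.0],
}
print(rank_illustrations([1.0, 0.0, 0.0], library))
```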

Conclusion (content analysis)
Segmenting the panels or reading the text of any comics is still challenging because of the complexity of some layouts and the diversity of the content. Figure 2 illustrates the difficulty of segmenting the panels. Most of the current methods focus on using handcrafted features for the segmentation and analysis and will fail on an unusual layout.
The segmentation of faces and body of the characters is still an open problem and a large amount of labeled data will be necessary to adapt the deep learning approaches.
Even if the text contains very rich information, surprisingly few methods have been proposed to analyze the storyline or the content of comics based on the text. Also, some parts of comics have not been addressed at all, such as the detection of onomatopoeias.
Future research should give more consideration to high-level information, as it can be used to represent information that could interest the reader such as the style or genre, the storytelling, etc.

Content generation
The aim of content generation or enrichment is to use comics to generate new content either based on comics or other media.

Vectorization
As most comics are not created digitally, vectorization is a way to transform scanned comics into a vector representation for real-time rendering at arbitrary resolution [78]. Generating vectorized comics is necessary for visualizing them nicely in digital environments. This is also an important step for editing the content of comics and one of the basic steps of comics enrichment [80].

Colorization
Several methods have been proposed for automatic colorization [54,66,15,23,79] and color reconstruction [37], as comics with colors can be more attractive for some readers. Colorization is quite a complex problem as the different parts of a character such as his arms, hands, fingers, face, hair, clothes, etc. must be retrieved to color each part in a correct way. Furthermore, the poses of a character can be very different from each other: some parts can appear, disappear or be deformed. An example of colorization is displayed in Fig. 10.
Recently, a deep-learning-based colorization approach has been used for creating color versions of manga books, which are distributed by professional companies in Japan (footnote 3).

Comics and character generation
One problem for generating comics is to create the layout and to place the different components such as the characters, text balloons, etc. at a correct position to provide a fluid reading experience. Cao et al. proposed a method for creating stylistic layout automatically [7] and then another one for placing and organizing the elements in the panels according to high-level user specification [8].
The relation between real-life environment or situations and the one represented in comics can be used to generate or augment comics. Wu and Aizawa proposed a method to generate a comics image directly from a photograph [76].
At the end of 2017, Jin et al. [33] presented a method to automatically generate comics characters. An example of a character generated by their online demo (footnote 4) is displayed in Fig. 11. The result of the generation is not always visually perfect, but still, this is a powerful tool as an unlimited number of characters can be generated.
Fig. 11 Example of random character generation based on the method of Jin et al. [33]. In this example, we set some attributes such as green hair color, blue eyes, smile and hat.
Fig. 12 Example of smiling animation in the conceptual space [73]. Similar animation could be obtained for comics images. Image source: https://vusd.github.io/toposketch/

Animation
As comics are still images, a way to enhance the visualization of comics is to generate animations. Recently, some researchers proposed a way of animating still comics images through camera movements [9,32]. Several animated movies and series have been adapted into printed comic books and vice versa. A possible outlook could be to generate an animated movie from a paper comic, or a paper comic from an animated movie.
For the natural images, some methods have been proposed to animate the face of people by using latent space interpolations. As illustrated in Fig. 12 the latent vectors can be computed for a neutral and smiling face to generate a smiling animation [73].
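The interpolation step itself is simple and can be sketched as follows. This example only produces the intermediate latent vectors between two endpoints; the generative model that would decode each vector into an image is not included, and the two-dimensional vectors and the function name `interpolate` are hypothetical. Linear interpolation is assumed here, although other schemes exist.

```python
from typing import List, Sequence


def interpolate(start: Sequence[float], end: Sequence[float],
                steps: int) -> List[List[float]]:
    """Return `steps` latent vectors evenly spaced from `start` to `end`."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    frames: List[List[float]] = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 (start) to 1.0 (end)
        frames.append([(1 - t) * s + t * e for s, e in zip(start, end)])
    return frames


# Hypothetical latent codes for a neutral and a smiling face.
neutral = [0.0, 0.0]
smiling = [1.0, 2.0]
for latent in interpolate(neutral, smiling, steps=3):
    print(latent)  # each vector would be decoded into one animation frame
```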
Another application is to extract the facial keypoints and to use another source (text, speech, or face) to animate the mouth of the character. For example, this has been done for generating a photorealistic video of an Obama speech based on a text input [38].

Media conversion
More broadly, we can imagine converting text, videos, or any content into comics, and vice versa. This problem can be seen as media conversion. For example, Jing et al. proposed a system to convert videos to comics [34]. A successful conversion involves many challenges: summarizing the videos, stylizing the images, and generating the layout of comics and the positions of text balloons.
An application which has not been applied to comics but to natural videos is to add generated sound to a video [81]. We could imagine a similar application for comics to generate sound effects (swords banging against each other, a roaring tailpipe, etc.) or atmosphere sounds (village, countryside, crowd, etc.).
Creating a descriptive text based on comics or generating comics based on descriptive text could be possible in the future, as it has been done for the natural images. Reed et al. [56] proposed a method for automatic synthesis of realistic natural images from text.
We can also imagine changing the content, adding or removing some parts, changing the genre or style depending on the user or author preference.

Conclusion (content generation)
In order to generate contents, some models or labeled data are necessary. In order to automatically generate characters, Jin et al. used around 42000 images. Deep learning approaches such as Generative Adversarial Networks (GAN) [24] have been widely used for natural image applications such as style transfer [83], reconstructing 3D models of objects from images [75], generating images from text [56], editing pictures [82], etc. These applications could be done for comics too.
Another possibility to enhance comics is to add other modes such as sound, vibrations, etc. Adding sounds should be easily possible by using the soundtracks from animation movies. But, in order to be able to produce these effects at a correct timing, information about the user interactions is necessary. This is possible by using an eye tracker or detecting when the user turns a specific page in real time.

User interaction
Apart from the content analysis and generation, we have identified another category of research based on the interaction between users and comics. One part consists of analyzing the user himself instead of analyzing comics. For example, we would like to understand or predict what the user feels or how he behaves while reading comics. Another part consists of creating new interfaces or interactions between the readers and comics. Also, new technology can be used to improve the access for impaired people.
Fig. 13 On the left: eye gaze fixations (blue circles) and saccades (segments between circles) of one reader. On the right: heat map accumulated over several readers; the red color corresponds to longer fixation time.

Eye gaze and reading behavior
In order to know where and when a user is looking at some specific parts of a comic, researchers are using eye tracking systems. By using eye trackers it is possible to detect how long a user spends to read a specific part of a comic page.
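As a rough illustration of how reading time per element could be derived from eye tracking data, the sketch below sums fixation durations falling inside rectangular regions of a page. It assumes that fixations have already been detected by the eye tracker software; the data formats, region names and the function `dwell_time_per_region` are hypothetical and not taken from any cited study.

```python
from typing import Dict, List, Tuple

# Hypothetical formats: a region is (x, y, width, height) in page pixels,
# a fixation is (x, y, duration in milliseconds).
Box = Tuple[int, int, int, int]
Fixation = Tuple[float, float, float]


def dwell_time_per_region(fixations: List[Fixation],
                          regions: Dict[str, Box]) -> Dict[str, float]:
    """Sum the duration of the fixations that fall inside each region.

    A fixation is assigned to the first region that contains it;
    fixations outside every region are ignored.
    """
    totals = {name: 0.0 for name in regions}
    for x, y, duration in fixations:
        for name, (rx, ry, rw, rh) in regions.items():
            if rx <= x < rx + rw and ry <= y < ry + rh:
                totals[name] += duration
                break
    return totals


regions = {"balloon": (0, 0, 50, 50), "face": (60, 0, 40, 40)}
fixations = [(10, 10, 200.0), (70, 20, 150.0), (20, 30, 100.0), (200, 200, 300.0)]
print(dwell_time_per_region(fixations, regions))
```

The regions themselves could come from the content analysis methods discussed earlier (balloon, face or panel detection), which is one way the two research categories depend on each other.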
Knowing the reading behavior and interests of the user is important information that can be used by the author or editors as feedback. It can also be used to provide other services to readers, such as giving more details about the story of a character that a specific user likes, removing parts of a battle if he does not like violence, etc.
Carroll et al. [11] showed that the readers tend to look at the artworks before reading the text. Rigaud et al. found that, in France, the readers spend most of their time reading the text and looking at the faces of the characters [60]. The same experiment repeated in Japan led to the same conclusion, as illustrated in Fig. 13.
Another way to analyze how the readers understand the content of comics is to ask them to manually order the panels. Cohn presented different kinds of layouts with empty panels and showed that various manipulations to the arrangement of panels push readers to navigate panels in alternate routes [17]. Some cognitive tricks can ensure that most of the readers will follow the same reading path.
In order to augment comics with new multimedia contents such as sounds, vibration, etc. it is important to trigger these effects at the right time. In this case, detecting when the user turns a page or estimating which position he is looking at will be useful.

Emotion
Comics contains exciting contents. Many different genres of comics exist such as comedy, romance, horror, etc. and trigger different kinds of emotions in the readers. Much research has been done on emotion detection based on face images and physiological signals such as electroencephalogram (EEG) while watching videos [36,69,68]. However, such research has not been conducted while reading comics. We think that analyzing emotions while reading might be more challenging, as movies contain animations and sounds that might stimulate the emotions of the user more.
Fig. 14 The user wears the E4 wristband, which measures physiological signals such as heartbeat, skin conductance and skin temperature.
By recording and analyzing the physiological signals of the readers as illustrated in Fig. 14, Lima Sanches et al. showed that it is possible to estimate if the user is reading a comedy, a romance or a horror comic, based on the emotions felt by the readers [42]. For example, when reading a horror comic book, the user feels stressed and his skin temperature decreases.
Emotions are usually represented as two axes: arousal and valence, where the arousal represents the strength of the emotion and the valence relates to a positive or negative emotion. Matsubara et al. showed that by analyzing the physiological signals of the reader, it is possible to estimate the reader's arousal [44].
Both experiments use the E4 wristband (footnote 5), which contains a photoplethysmogram sensor (to analyze the blood volume pulse), an electrodermal activity sensor (to analyze the amount of sweat), an infrared thermopile sensor (to read the peripheral skin temperature), and a 3-axis accelerometer (to capture motion-based activity). Such a device is commonly used for stress detection [35,25].
Still, each reader has his own preferences and feels emotions in a different way while reading, so these analyses are quite challenging. Depending on his state of mind or mood, a user might prefer to read content that elicits specific kinds of emotions. Emotion detection could be used by authors or editors to analyze which content stimulates the readers the most.
Footnote 5: https://www.empatica.com/en-eu/research/e4/
Fig. 15 The Tanvas tablet enables the user to feel different textures. This could be used to enhance the interaction with comics. Source: https://youtu.be/ohL_B-6Vy6o?t=19s

Visualization and interaction
Comics can be read on books, tablets, smartphones or any other devices. Visualization and interaction on smartphones can be difficult, especially if the screen is small [6]. The user needs to zoom and do many operations which can be inconvenient. Some researchers are also trying to use more interactive devices such as multi-touch tables to attract the users [1].
Another important challenge is to make comics accessible to impaired people. Rayar [55] explained that a multidisciplinary collaboration between Human-Computer Interaction, Cognitive Science, and Education Research is necessary to fulfill such a goal. Up to now, the three main ways for visually impaired people to access images are audio description, printed Braille description and printed tactile pictures (in relief). Such representations could be generated automatically thanks to new research and technology.
New haptic feedback tablets such as the one proposed by Meyer et al. [47], illustrated in Fig. 15, could help visually impaired people to access comics. Other applications such as detecting and magnifying the text or moving the comics automatically could be helpful for impaired people.

Education
It has been shown that representing knowledge as comics can be a good way to attract students to read [21] or to learn a language [65]. It could be interesting to measure the impact of this representation of knowledge.
Comics could be, for some students, a more interesting way to learn, so using comics in education might be a way to augment their attention level and memory if comics are nicely designed. A challenge related to media conversion is then to transform normal textbooks into comics and to compare the interactions of the students with both books.

Conclusion (user interaction)
The interactions between users and comics have not yet been analyzed in depth. Many sensors can be used to observe the reader: brain activity, muscle activity, body movement and posture, heart rate, sweating, breathing, pupil dilation, eye movement, etc. Collecting such data can reveal more about both the readers and the comics.

Available materials
In this section, we present some tools and datasets which are publicly available for the research on comics.

Tools
Several tools for comics image segmentation and analysis are available on the Internet and can be freely used by anybody, such as:
- Speech balloon segmentation [57],
- Speech text recognition [62],
- Automatic text extraction cbrTekStraktor 6 ,
- Annotation tool to create ground truth labels 7 ,
- Semi-Automatic Manga Colorization [23]
The speech balloon [57] and text segmentation [62] algorithms are available on the author's GitHub 10 .
As we can see, although many papers have been published on comics segmentation and understanding, few tools are available on the Internet. Making the code available is an important step for the community, both to improve the algorithms significantly and to be able to compare them.

Datasets
Few datasets have been made publicly available because of copyright issues. Indeed, researchers cannot use and share large datasets of copyrighted materials, so organizing competitions and conducting reproducible research is not easy. Fortunately, several datasets have recently been made available.
Fig. 16 Example of a Japanese manga cover page contained in the Manga109 dataset [22].
The Graphic Narrative Corpus (GNC) [20] provides metadata for 207 titles, such as the authors, number of pages, illustrators, genres, etc. Unfortunately, the corresponding images are not available because of copyright protection, so the usefulness of this dataset is very limited. The authors are willing to share segmentation ground truth and eye gaze data, but such data has not been released yet.
eBDtheque [26] 11 contains 100 comic pages, mainly in French. The following elements have been labeled: 850 panels, 1092 balloons, 1550 characters, and 4691 text lines. Even though the number of images is limited, such detailed labeled data is time-consuming to create and very useful for the community.
Manga109 [22] 12 , illustrated in Fig. 16, contains 109 manga volumes from 93 different authors. On average, a volume contains 194 pages. These mangas were published between the 1970s and 2010s and are categorized into 12 different genres such as fantasy, humor, sports, etc. Only limited labeled data is available for now, such as the text for a few volumes. The strong point of this dataset is that it provides all pages of each volume, which allows sequences of pages to be analyzed.
COMICS [30] 13 contains 1,229,664 panels paired with automatic textbox transcriptions from 3,948 American comic books published between 1938 and 1954. The dataset includes ground truth labeled data such as the rectangular bounding boxes of panels on 500 pages and 1,500 textboxes.
11 http://ebdtheque.univ-lr.fr/registration/
12 http://www.manga109.org/index_en.php
13 https://obj.umiacs.umd.edu/comics/index.html
Fig. 17 Examples of comics contained in the BAM! dataset [74]. From left to right, two images are shown for each of the following labels: bicycles, birds, buildings, cars, dogs, and flowers.
BAM! [74] 14 contains around 2.5 million artistic images in media such as 3D computer graphics, comics, oil painting, pen ink, pencil sketches, vector art, and watercolor. The images are annotated with emotion labels (peaceful, happy, gloomy, and scary) and object labels (bicycles, birds, buildings, cars, cats, dogs, flowers, people, and trees). Figure 17 shows a sample of the comics contained in the dataset. The dataset is interesting because of its labels and its large variety of content and languages. However, the images are only examples provided by the authors and cannot always be understood without the previous or following pages.
BAM!, COMICS, Manga109, and eBDtheque are the four main comics datasets that have been made available with the corresponding images. Building such datasets is costly in time and money, especially for creating the ground truth and labeled data.
The main obstacle to creating such datasets is legal and copyright protection, which prevents researchers from making image datasets publicly available. The content of a dataset also matters, depending on the research to be carried out. For example, it is interesting to have a variety of comics from different countries, with different languages and genres. It is also interesting to have several consecutive pages from the same volumes and several volumes from the same series in order to analyze the evolution of an author's style, the mentality of a character, or the storyline.
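To give a rough sense of the scale of these datasets, the figures quoted in this section can be turned into simple averages. The Python sketch below uses only numbers stated above; the dictionary layout and the function names are illustrative assumptions, and the Manga109 page count is only an estimate because the per-volume average is rounded.

```python
# Figures quoted in this section for eBDtheque; the layout is illustrative.
EBDTHEQUE = {
    "pages": 100,
    "panels": 850,
    "balloons": 1092,
    "characters": 1550,
    "text_lines": 4691,
}


def per_page_averages(stats):
    """Average number of each labeled element per page."""
    pages = stats["pages"]
    return {name: count / pages
            for name, count in stats.items() if name != "pages"}


def manga109_estimated_pages(volumes=109, avg_pages_per_volume=194):
    """Estimated total page count (the per-volume average is rounded)."""
    return volumes * avg_pages_per_volume


def comics_panels_per_book(panels=1_229_664, books=3_948):
    """Average number of panels per book in the COMICS dataset."""
    return panels / books


if __name__ == "__main__":
    for name, avg in per_page_averages(EBDTHEQUE).items():
        print(f"eBDtheque: {avg:.2f} {name} per page")
    print(f"Manga109: about {manga109_estimated_pages()} pages")
    print(f"COMICS: about {comics_panels_per_book():.1f} panels per book")
```

Such averages only describe size. They say nothing about the variety of countries, languages, and genres, or about the continuity of pages discussed above.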

General conclusion
Research about comics in computer science has addressed several aspects. We organized this research into three interdependent categories: content analysis, content generation, and user interaction. A joint analysis of the reader and the comics is necessary to better understand how comics can be augmented.
14 https://bam-dataset.org
A large part of previous work focuses on low-level image analysis using handcrafted features and knowledge-driven approaches. Recent research focuses more on deep learning and high-level image understanding. Still, most applications have targeted natural images, and research on artworks and comics has only very recently begun to receive more attention [74].
Many fields remain unexplored, especially content generation and augmentation. Only a few companies have started to use research results, for example for automatic colorization, but it is clear that authors could be helped with automatic (or semi-automatic) generation of content or animation.
The analysis of the behavior and emotions of readers has only been superficially covered. However, taking advantage of new technologies and sensors could help create the next generation of comics. It could also be a way to make comics more accessible to people with impairments.
For now, few tools and datasets have been made available. Making copyrighted images publicly available is problematic, but it would greatly contribute to the improvement of comics research.