1 Introduction

Human-made machines can already perform a wide range of labor-intensive work. Yet, driven by demands for higher productivity and perhaps simple curiosity, humans have long been trying to infuse human intelligence into machines, which constitutes the original motivation of Artificial Intelligence (AI). AI research has been going on for over 65 years and has made impressive achievements in both theoretical study and real-world applications [1, 2]. AI is used almost everywhere and is considered a core skill for the future. The AI market is projected to grow to $190 billion by 2025, at a compound annual growth rate (CAGR) of over 36% between 2018 and 2025 [3].

There are many definitions of artificial intelligence. In the Turing test, AI is defined as the ability of machines to communicate with humans (using electronic output devices) without revealing that they are not human, where the essential judgment criterion is binary. Marvin Minsky, one of the pioneers of AI, defined AI as enabling machines to do things that require human intelligence. The symbolic school holds that AI is the manipulation of symbols, with the most primitive symbols corresponding to physical entities. Although descriptions of AI vary, its core is widely believed to be the theories, methods, technologies, and applications for simulating, extending, and expanding human intelligence. Nowadays, the concept of AI has an increasingly profound impact on human life. Just as the steam engine was to the Age of Steam, the generator to the Age of Electricity, and the computer to the Age of Information, AI is the technological pillar of the contemporary era and beyond.

“AI” has become a buzzword in almost every aspect of our lives. The semantic network graph (Fig. 1) is built from search results in Web of Science (2021-08-26) and plotted with the VOSviewer software. It shows the degree of impact and the connections of the keywords most related to AI. The colors of the links indicate that the “application” of AI has received great attention in the literature. The concept is closely related to “system” sciences, while “neural network,” “classification,” and “prediction” are the main focuses in terms of algorithms. The research fields of AI include systems and engineering, brain science, psychology, cognitive science, mathematics, computer science, and many others. The application fields of AI are extensive, covering (but not limited to) speech recognition [4, 5], image processing [6, 7], natural language processing [8, 9], smart robots [10, 11], autonomous vehicles [12, 13], energy systems [14, 15], healthcare [16, 17], Fintech [18], etc. In limited areas, AI has surpassed humans. This growing list of AI-beyond-human feats has triggered a new wave of discussion on how AI may change human society. Although many AI applications are stunning, they have not frightened us in the manner of sci-fi movies such as The Terminator or The Matrix, because their capability is designed to be area-specific and non-comprehensive. As discussed in the subsequent sections, the mainstream successes are limited to ANI (artificial narrow intelligence) rather than AGI (artificial general intelligence). Nevertheless, this may not remain the case from a long-term historical and forward-looking perspective. Therefore, at this critical time of rapid development of AI technologies, we believe it is time to discuss the past, the present, and the future of both AI tools and AI systems.

Fig. 1 Semantic network of artificial intelligence

The article unfolds along the time axis, focusing on the past, the present, and the future. Specifically, the content is organized as follows. Section 2 introduces the history of AI development, including the birth of AI, the AI winters, and the resurgence of AI. Section 3 presents the state of the art and the advancements in the discipline, including artificial narrow intelligence (ANI), autonomous systems, symbiotic systems, and human-in-the-loop systems. Section 4 discusses the future of AI, including understanding and cognition, artificial general intelligence, singularity, and artificial super intelligence. The last section concludes the article.

2 Past

Figure 2 shows a timeline that sketches the history of AI by highlighting the important events. The following discussion is broadly aligned with this timeline, but presents more detailed examples and pays particular attention to the underlying factors that affected the prosperity of AI research and development.

Fig. 2 The timeline of AI (copied with permission from [1])

2.1 The birth of AI and the golden age

The period before 1956 is regarded as the incubation period of AI. During that time, scientists and engineers tried to replace part of their mental work with machines. In 1936, mathematician Alan Turing proposed a mathematical model of an ideal computer, which laid the theoretical foundation for the electronic computers that came later. Neurophysiologists W. McCulloch and W. Pitts built the first neural network model (the M-P model) in 1943 [19]. The M-P model is the first mathematical model constructed to mimic the structure and working principle of biological neurons and can be regarded as the earliest artificial neural network. In 1949, Hebb proposed a learning mechanism based on neuropsychology [20]. The “Hebb learning rule” is an unsupervised learning rule that can extract statistical features of the training set and classify data according to similarity. This is the earliest idea of machine learning (ML), and it is very close to the process of human cognition. In 1952, IBM scientist Arthur Samuel developed a checkers program that could learn an implicit model from the current position and guide subsequent moves [21]. In this context, such game-playing programs were among the earliest work in evolutionary computing: the algorithm pits a modified copy against the current best version, and the winner becomes the new standard.

It was John McCarthy who coined the term AI at the 1956 Dartmouth Summer Research Project on Artificial Intelligence [2]; he is therefore considered the father of artificial intelligence. Since this event, research into AI has yielded many remarkable achievements, including machine learning, theorem proving, pattern recognition, problem solving, expert systems, and natural language processing.

In 1957, American psychologist Frank Rosenblatt introduced the “Perceptron” model [22]. It can be used to build a system that uses “neurons” for recognition, and its capability of learning profoundly influenced the subsequent design of neural networks and connection mechanisms. The pioneering work on the perceptron remains a popular topic in introductory AI courses today. In 1960, the perceptron algorithm was translated into a physical hardware implementation, the “Mark I Perceptron,” as an artificial brain [23], consisting of an array of perceptrons connected to a camera. At that time, it was magical and inspiring to see a machine trained to distinguish, with reasonable accuracy, whether the person in a photograph was male or female. In the same period, Newell et al. summarized the thinking rules of humans through psychological experiments and compiled a general problem-solving program that could be used to solve 11 different types of problems [24]. A few years later, E. A. Feigenbaum of Stanford University designed an expert system [25] that could determine the molecular structure of compounds based on the analysis of mass spectrometry data. The study of AI during this period was groundbreaking, with outcomes showing excellent prospects [26]. Meanwhile, computing developed into a field of its own: transistors and hardware architectures progressed every year, and the software and computer programs essential for implementing AI theories and algorithms emerged.

2.2 AI winters

AI winters refer to periods when AI funding significantly decreased due to disillusionment [1]. There have been two recognized AI winters in history. The first occurred around the 1970s [27]. Earlier, researchers had been over-confident about their preliminary results in machine translation. Excessive hype about an optimistic future for machine translation attracted considerable investment, in a context where the US defense establishment (the predecessor of today's DARPA) expected automatic Russian-to-English translation during the Cold War. It turned out that the difficulty of the task had been severely underestimated, and little progress was made from 1967 to 1976. The failure of machine translation disappointed the investors and poisoned the development of the whole AI field. In 1966, the Automatic Language Processing Advisory Committee (ALPAC) had already submitted the report “Language and Machines” to the US government, disputing the feasibility of machine translation [28]. In 1969, Minsky and Papert published Perceptrons [29], which pointed out a significant flaw of the perceptron: it can only solve linearly separable problems. In 1973, the British Science Research Council (SRC) presented the Lighthill Report to the British government [30]. These criticisms triggered a significant shrinkage of the funding for fundamental research in AI.

As the effects of the first AI winter slowly faded, a new wave of interest in AI technologies arose, oriented towards commercial applications. This time, expert systems were extensively implemented based on “if–then” rules set up by domain experts. Once again, their potential was hyped because people had insufficient awareness of their limitations in dealing with highly complex tasks and systems, such as medical diagnosis and vision systems. In 1984, John McCarthy commented that a critical defect of expert systems lies in their lack of knowledge about their own limitations [31]. In the late 1980s, AI development experienced another trough, accompanied by the withdrawal of funds and the shift of interest to other rising technologies such as general-purpose computers.

Despite the significant differences in historical background, several common factors can be identified in the emergence of the AI winters [32]. From a professional viewpoint, researchers made insufficient and biased evaluations of the pros and cons, so their predictions and conclusions lacked rational, rigorous, practical, and comprehensive analysis. Furthermore, the news media acted as a catalyst in turning the public focus away from the real challenges that researchers and practitioners were facing [33]. Another explicit factor was the underdeveloped tools of the related disciplines, such as the unavailability of algorithmic approaches (mathematics), information resources (electrical engineering), and computing power and information infrastructure (computer science). Moreover, investors were unaware of the risks and underestimated the time required for fundamental research. Large-scale investments followed by divestments were the triggers of the AI winters.

2.3 The rise of AI

Several vital factors have facilitated the resurgence of AI.

First, the success of machine learning is one of the initial and fundamental drivers. Since the late 1980s, a series of important ML theories and techniques have been proposed, many of which are now standard material in ML textbooks [34, 35]. Typical examples include the decision tree, proposed by J. R. Quinlan in 1986; the support vector machine, proposed by Cortes and Vapnik in 1995; AdaBoost, proposed by Freund and Schapire in 1997; and random forests, proposed by Breiman in 2001. As for neural network-related advances in this period, it was not until the article promoting back-propagation was published in Nature in 1986 [36] that the research community paid serious attention. The successes in ML and the other advances discussed below became complementary factors on the path of AI resurgence.
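
For readers less familiar with these classic learners, the following minimal sketch (our own example with synthetic data, using scikit-learn, not taken from the cited works) shows how a decision tree, a support vector machine, AdaBoost, and a random forest can be trained and compared on the same task.

```python
# Minimal, illustrative comparison of the classic ML algorithms named above.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "support vector machine": SVC(kernel="rbf"),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)               # learn patterns from the training data
    print(name, model.score(X_test, y_test))  # held-out accuracy
```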

Second, colossal amounts of data are available as fuel for model training [37, 38]. The quality and scale of the datasets largely determine how robust the prediction/classification performance can be [39, 40]. High-quality datasets built in typical and practical scenarios are encouraged to be shared within the research community. In this context, people devote effort to designing and carrying out exhaustive experiments to collect real-world data representing typical working conditions. In this process, the choice of variables and how the data are measured constitute part of the intellectual input of humans. As one of the key factors contributing to successful AI training, such experience has a non-negligible influence on the comprehensiveness and typicality of the datasets, because it decides how well the data distribution of the constructed dataset aligns with the ground truth. Publicly available datasets function as baselines for defining AI challenges and for the fair comparison of novel AI algorithms [41].

Third, the rise of AI is associated with a significant increase in computing power [42, 43]. The expansion of the physical scale of AI-based machines enables them to deal with highly complicated tasks, but it also inevitably slows down information transmission and processing and leads to considerable energy consumption. A promising breakthrough is expected from photonic computing, to overcome the bottleneck of today's general practice based on electrons [44]. In addition to the maximum possible speed constrained by physical laws, numerous implementation challenges may cause significant delays. With advanced acceleration techniques in both hardware architecture [45] and software algorithms, training tasks that used to take several months can be shortened to several days or even hours [46,47,48]. This is possible because different deep neural networks can be decomposed into blocks in which many of the required operations are the same. For example, a typical CNN (Convolutional Neural Network) block usually involves convolution, pooling, normalization, activation, and a fully connected layer [49]. In recent years, fruitful outcomes have been reported on AI chips. Dedicated NPUs (Neural network Processing Units) have been developed and used in Kirin 970 chips and the A11–A14 Bionic chips for iPhones. This makes it possible to perform real-time onboard (portable-device) tasks such as video processing for live broadcasts, fast biometric identification, and games with 3D rendering [50,51,52].
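
As an illustration of the "typical CNN block" mentioned above (a minimal sketch with assumed layer sizes, unrelated to the cited chip designs), the following PyTorch snippet chains convolution, normalization, activation, and pooling before a fully connected layer; accelerators profit from the fact that such blocks reuse the same few operations.

```python
import torch
import torch.nn as nn

class SimpleCNNBlock(nn.Module):
    """One typical CNN block followed by a fully connected classifier."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolution
            nn.BatchNorm2d(16),                          # normalization
            nn.ReLU(),                                   # activation
            nn.MaxPool2d(2),                             # pooling
        )
        self.classifier = nn.Linear(16 * 16 * 16, num_classes)  # fully connected layer

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

logits = SimpleCNNBlock()(torch.randn(1, 3, 32, 32))  # one assumed 32x32 RGB image
print(logits.shape)                                   # torch.Size([1, 10])
```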

Fourth, several AI systems that outperformed top human players in contests and competitions impressed the public and rebuilt confidence in AI [53, 54]. In 2011, IBM's AI system Watson won the championship on the popular American quiz TV show “Jeopardy!”. Although it was not connected to the Internet, it had four terabytes of storage covering 200 million pages of information, including the full text of Wikipedia. This was a remarkable achievement in implementing a machine with joint abilities ranging from information search to knowledge representation, automatic reasoning, and natural language processing. From 2016 to 2017, AlphaGo challenged professional go players and defeated over 60 human masters of go, including the world champion Lee Sedol and the world's top-ranked player Ke Jie. AlphaGo's developer, the team at Google's DeepMind, later built a new AI system called AlphaGo Zero, which was entirely self-taught from scratch, with zero information from human game records. The system surpassed thousands of years of accumulated human knowledge through reinforcement learning and, strikingly, discovered novel strategies within a few days [55]. These examples show the dominant role of deep learning in the new AI age and demonstrate the phenomenal speed at which AI can grow.

In Table 1, the critical events in AI development during the past 25 years are summarized.

Table 1 Important events in AI development: the last 25 years

3 Present

3.1 Widespread use of ML and then DL

At the very beginning of Google's machine learning handbook, it states, “If you can build a simple rule-based system that does not require machine learning, do that.” This can be interpreted in two ways: first, as a type of heuristic approach, machine learning still cannot achieve guaranteed performance with full interpretability; second, ML may achieve satisfactory (though still not guaranteed) performance in specific scenarios that rule-based methods cannot handle. There are considerable challenges in figuring out all the rules and steps for highly complex problems, and non-AI schemes designed purely by human intelligence would lead to significant defects in the solutions or decision logic [57, 58]. Likewise, expert systems mimic humans' way of thinking and rely heavily on domain knowledge and human experience; furthermore, the key to designing an expert system lies in exploiting expertise rather than methodology. Due to insufficient generalization ability and low robustness, these methods soon reached a bottleneck in widespread applications.

Rather than specifying the inputs and a human-defined algorithm to obtain the outputs, an alternative way to solve complex problems is to provide the inputs and the expected outputs and let machines find the patterns between them automatically, thereby “learning” the algorithm [59,60,61]. This is the general idea behind (supervised) machine learning. Machine learning shows superiority in the following aspects. First, ML is good at learning from colossal amounts of structured data, whereas humans are not, due to limited memory and brain capacity. Second, having “seen” an adequate number of training samples, the learned ML solutions show good generalization capability. Third, each newly available data sample contributes to adjusting the input–output mapping [62]; the learning process can be realized incrementally, which enables online and life-long learning [63]. Apart from supervised learning, there are also practical requirements for unsupervised learning. In many cases, labels for the training data are unavailable because of the high workload and the need for domain experts. Clustering-based approaches can automatically divide data into groups without specifying an exact mapping to the feature space; however, for estimation and prediction tasks, proper encoding/decoding procedures are needed. A similar idea applies to reinforcement learning, which sits between the supervised and unsupervised paradigms: the agent learns continuously by taking actions and adjusting its acting rules (“policies”) based on the evaluation of feedback (“rewards”). To make action evaluation implementable within a limited period, the action space usually has to be spanned by a finite number of elements. A typical example is collaborative robot control [64].
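
As a minimal illustration of the reinforcement-learning loop just described (a toy problem of our own choosing, not from the cited work), the sketch below applies tabular Q-learning to a five-state corridor: the agent acts, receives a reward, and adjusts its policy accordingly, over a finite action space.

```python
import numpy as np

n_states, n_actions = 5, 2          # toy corridor; actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions)) # action-value table encoding the policy
alpha, gamma, epsilon = 0.1, 0.9, 0.2

for episode in range(300):
    s = 0
    while s != n_states - 1:        # episode ends at the goal (rightmost state)
        # epsilon-greedy policy: mostly exploit the current policy, sometimes explore
        a = np.random.randint(n_actions) if np.random.rand() < epsilon else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0      # reward only at the goal
        # adjust the acting rule ("policy") based on the reward feedback
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # learned policy: action 1 (move right) in every non-terminal state
```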

To deal with highly complex problems and discover insights from extensive data collections, deep structures are necessary [65,66,67]. On the one hand, specifying the neural network model structure parameterizes a set of functions, which decides the upper limit of learnability. As an extreme example, no instance from the set of all linear functions can ever describe a quadratic relationship, regardless of which parameters are chosen. Deep structures can therefore capture more complex nonlinear relations between the inputs and the outputs than shallow, flat structures with the same number of neurons (successful examples of deep structures include GoogLeNet with 22 layers and AlphaGo with 40 layers). On the other hand, the learned weights determine which single function from the set minimizes the loss function, and how well they are learned determines the upper limit of performance (precision, robustness, confusion matrix, etc.). DL has shown strength in dealing with unstructured data, such as a piece of experience, a specific scenario, or skills transferred from other activities. The inputs can be anything that is properly encoded, e.g., an image, a video clip, a sound wave, text, or readouts from various sensors; at the stage of problem formulation, rules are established to transform unstructured data into structured data. The outputs can be indicators taken from a finite set of predefined categories (for classification tasks) or instances of variables with an infinite number of possible values (for prediction tasks).
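
The following small sketch (our own illustration, with assumed synthetic data) makes the capacity argument above concrete: a purely linear model can never fit y = x^2, whereas a small network with hidden neurons approximates it from the same data.

```python
import torch
import torch.nn as nn

x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = x ** 2                                     # quadratic ground truth

linear = nn.Linear(1, 1)                       # parameterizes only linear functions
mlp = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # nonlinear hypothesis set

for model in (linear, mlp):
    opt = torch.optim.Adam(model.parameters(), lr=0.01)
    for _ in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    # the linear model's loss stays high no matter how well it is trained
    print(type(model).__name__, float(loss))
```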

In terms of algorithmic development, several researchers (including us) have experimented with techniques from a wide spectrum, ranging from (purely) knowledge-driven solutions to data-driven approaches, more recently expanding the range to ML/DL-based techniques supported by big data and knowledge-based work automation. Helpful knowledge is often scattered and dispersed across raw data and fragmented information. The challenge is to mine the evidence hidden in hard-to-notice details, so the ML/DL technical route requires logical reasoning/deduction as well as subtle, experience- or intuition-based attention to the critical aspects. Regarding the wide acceptance of this research route, mainstream research fields include computer vision and pattern recognition, robotics and automation, natural language processing (considered to be the holy grail of AI), knowledge discovery and data mining, strategy- and decision-making, and so forth.

3.2 Artificial narrow intelligence (ANI)

The application of AI techniques has penetrated our daily lives [68]. Most existing AI systems are designed for a limited range of predefined tasks; in this context, they are called artificial narrow intelligence (ANI) or weak AI. When we need to go somewhere but do not know the route, all we need to do is specify the destination address in the map app on a smartphone, and it will provide a list of optimized path-planning solutions that minimize traveling time or cost. When we want to acquire information from the Internet, the search engine does most of the preliminary information collection, including finely targeted advertising. Moreover, reliable real-time biometric analysis has enabled novel means of personal identification and, as a result, novel means of payment based on fingerprints and face recognition. Many ANI applications do not require globally optimal solutions, which are often challenging to obtain with formal approaches, so heuristic approaches have been widely studied as time-feasible alternatives. For safety-critical/security-critical systems, it is favorable to compromise on optimality and achieve a theoretically guaranteed minimum performance by constraining the solution space. In recent years, quantum technologies have been suggested as promising for cyber-security [69, 70]. In August 2016, Chinese scientists launched the first-ever quantum communication satellite. The maturation of the new generation of communication and computing techniques, such as quantum transmission and quantum computation, may establish a new norm in AI-embedded ecosystems.

In the domain of knowledge-work automation, ANI systems have been widely applied to Q&A robots serving banks, airline hotlines, and online consultation [71]. The key features of these applications are a precise categorization of the domain knowledge and a limited range of frequently asked questions. From a domain-specific perspective, ANI tools have been or are being developed for education, medication, transportation, marketing, and game production [72,73,74,75,76]. By contrast, ANI becomes insufficient for highly complex tasks, such as optimization problems in the process industry, and can lead to unsatisfactory decision-making performance. The main challenge lies in connecting engineering operations limited by physical laws (objective factors) to human experience and to socio-economic factors (subjective factors). This requires a set of ANI systems working jointly towards the same target. In this sense, the ability of ANI is constrained to problem solving rather than problem definition and resource exploitation, and is thus, in essence, dependent on how well the problems can be defined by humans.

3.3 Autonomous systems

An autonomous system is a group of components forming a unified whole with the power to act independently of the external environment [77, 78]. Today, autonomous vehicles and autonomous drones have become typical examples. They are designed to perceive the surroundings, maintain awareness of their real-time status, make plans and decisions, and take actions, all achieved by onboard devices and/or with the aid of resources on the cloud and at the edge, with little human intervention. These behaviors are driven by human-like problem-solving and learning abilities, which are the backbone of any autonomous system. In case of emergency, human instructions can override vehicles and drones; however, this is not always feasible in extreme environments such as deep-space and deep-sea exploration, where the time delay of information transmission becomes significant owing to the long distances. Therefore, a robust autonomous system needs self-diagnosis and self-restoration capabilities [79].

From another perspective, autonomous learning is necessary for generic AI systems. The learning ability determines how the autonomous system self-evolves in the long term. In contrast to natural beings, most engineered autonomous systems have fixed functionality sets that usually cannot be expanded once constructed. It should be noted that not all learning-based approaches enable autonomous learning; the point lies in whether they are used for offline model training only or for online updates as well. The latter is also referred to as life-long learning. Training with positive examples, commonly understood as teaching and practicing, is a means of learning in the narrow sense. Beyond it, ability migration (transfer learning), critical thinking (adversarial training), imitation (imitation learning), and creativity (heuristic design) are beginning to prove themselves as alternative effective technical routes for AI design [80, 81].

Knowledge, information, and data constitute the fuel of the learners. Knowledge describes regular patterns and abstract facts that humans understand; it is usually semantic and embedded in books and research articles. To be interpretable and useful for machines, it needs to be modelled, transformed, and generated, which corresponds to the tasks of knowledge representation, knowledge reasoning, and knowledge discovery. Information is transmissible content with timeliness, such as news and contextual data flows. The backbone of the contemporary industrial revolution, the cyber-physical system (CPS), bridges the virtual and physical spaces by establishing an automatic flow of information and knowledge. In this context, the entities of autonomous systems can maintain digital replicas (digital twins, DT) that evolve along with the real-time flow of information/data. The entities and their corresponding digital replicas form a conjugate pair to deal with real-world complexity and uncertainty [82, 83]. In the future, the digital replica may eventually become a feature subset of the physical entity. Furthermore, subsystems and submodules in the autonomous system can be linked to the knowledge base and information flow of the entire fleet of assets, combining workflows in a novel closed-loop manner. It is worth noting that R&D activities on closed-loop empowerment systems based on CPS/DT are still ongoing [84]; a few industrial examples are introduced in [83] and the references therein.

3.4 Symbiotic systems and human-in-the-loop systems

While the AI systems of today outperform humans only in a limited range of highly specific tasks, there are still risks and concerns about a future in which humans must co-exist with novel forms of AI systems. It may not be too early to start thinking about the roles of, and the relationship between, humans and intelligent machines.

Human-AI symbiotes are considered a possible form of harmonious co-existence of human beings and AI. With the aid of brain-machine interfaces (BMIs), AI acts as a kind of digital extension of the central nervous system. In this context, the boundary between the biological and the artificial mind is blurred. There may also be drastic changes in how we communicate with each other and interact with machines, since information exchange is more efficient via nerve (bio)electric signals alone than via nerve signals that drive the muscles to generate facial expressions, voices, postures, or gestures. To this end, there need to be strong connections from human minds to AI systems and vice versa. Today, the fundamental work is being carried out.

Depending on the form of information representation, machines can receive human instructions by means of bio-signal recognition or behavior recognition. The former takes EEG (electroencephalography) signals, thermal signals, and even chemical signals and interprets them using signal processing/pattern recognition approaches; the latter takes optical and audio signals and interprets them with machine vision and speech recognition techniques. Beyond all this, people today spend much time living digital lives and are equipped with numerous electronic devices. The information uploaded to various apps is useful for creating digital profiles. For instance, personal social media networks characterize daily behavior and interpersonal relationships, while real-time public medical data can reflect an epidemic situation [85,86,87]. Smartphones, smart glasses, tablets, and so forth act as auxiliary interfaces to monitor and influence human behavior [88].

Regarding implementation solutions that bridge human minds and AI, further advancements in BMI (or brain-computer interface, BCI) technologies are necessary [89]. The main tasks include capturing, amplifying, analyzing, and interpreting brain-wave signals and, based on these, controlling the connected machines/computers. Success necessitates interdisciplinary research covering neuroscience, data science (signal processing and machine learning), as well as communication and sensing technologies [90]. In the laboratory environment, researchers have managed to achieve non-invasive implementations for writing and spelling [91, 92], prosthetic/exoskeleton control [93], and general-purpose communication [94]. By contrast, invasive BMI techniques with electrodes in the brain can achieve better control precision and are more suitable for real-time applications. Additional advancements in the study of brain functions and in the development of nanomaterials and implant surgery techniques are necessary for implementation on a broader scale [95, 96]. It is also worth noting that state-of-the-art practice is still limited to open-loop control and unidirectional information flow, from human brains to computers. This is due more to ethical concerns than to technical feasibility, as experiments with animals have demonstrated the effectiveness of artificial stimuli in controlling visual and motion perceptions [97].

With the establishment of effective links, demand-driven application scenarios that benefit from the symbiotic ecosystem have started to surface. Over the past decade, human-in-the-loop (HITL) systems have become a mainstream research subject that addresses the modeling and prediction of human-related factors so that they can be treated as part of closed-loop systems. In contrast to the industrial practice that treats human participants as unfavorable factors [98] and pursues unmanned, highly automated operation, the main challenge lies in dealing with the uncertainties introduced by the human participants. While it is still impractical to interpret human thinking in real time, AI may identify intentions by observing our status and behavior, by learning from the changes we make to the environment with which we are interacting, or by learning from how we reacted in previous cases.

HITL technology plays an essential role in training, design, operation, and optimization [99,100,101]. For instance, pilots are taught on flight training simulators (FTS) in which the flight environment is adaptively adjusted and rendered by AI. Similarly, surgeons and dentists can carry out virtual surgeries as part of training or strategy-making before actual medical activities. As for design, AI creates groups of (digital) prototypes as candidate product solutions and constantly interacts with the designers or the customers to meet personalized requirements in a parallel, case-specific, and efficient manner. In industrial assembly processes, smart robots, or collaborative robots as they are known, help human operators (and vice versa) to manipulate tools and give warnings and suggestions [102, 103]. Autonomous driving systems and advanced driver assistance systems (ADAS) monitor traffic conditions and free human drivers from the need for constant attention. Furthermore, with HITL optimization techniques, AI systems can be less conservative in making recommendations: they focus on the best options and are not concerned about outrageous ones, because the human judges will filter those out. Based on these examples, it can be summarized that both humans and AI interact with the environment, perceiving its states and making corresponding decisions and changes. From the perspective of systems and control, there are two controllers in the same closed loop, driven by the biological and the artificial brain, respectively. In this sense, to ensure overall stability and achieve good performance, understanding and predicting human actions is no less important than designing explainable AI. The attractive part is that humans excel at leveraging unstructured data while machines are better at analyzing structured data, forming a natural complement.

In the long term, there are several advantages if human-AI symbiotic systems become more common. First, symbiotic systems may ensure that the level of intelligence improves in a balanced and controllable manner: no individual or group would hold monopolistic information resources in all domains, so AI dictators, or human dictators powered by super-AI, might be avoided. Meanwhile, the competencies of humans and AI are coordinated into a collaborative relationship [104]. Additionally, the openness and transparency of AI development can be maintained throughout the establishment of a symbiotic ecosystem. This relies on interpretable operating mechanisms and explainable AI (XAI), helping us to understand, from the AI's perspective, how a problem/task is defined, what information sources are needed, how knowledge is exploited from big data, what factors are taken into consideration, which solution is adopted to solve the problem, and eventually how a decision or a prediction is made. Furthermore, in-depth integration with AI systems may unleash the potential of the human brain and elevate overall social efficiency. The limitations in information availability and computing capability might be eliminated; instead, the bottleneck would lie in creativity, horizons, and non-renewable resources.

3.5 Driving forces of AI development

A growing body of participating disciplines and stakeholders facilitates the AI R&D ecosystem and the related ecology. In the short term, the development of AI is to a large extent driven by dominant stakeholders, especially with support from major investments and governmental strategic planning [3, 68]. The key lies in whether the novel technologies can be put into practice swiftly, contributing to economic returns and regional/social stability. On the other hand, learning from the historical experience of the "AI winters", it remains uncertain whether, in the mid to long term, AI development will converge to the basic demands of the general public: improving the quality of life, the efficiency of work, and the degree of happiness.

The healthcare industry spans all parties, from patients to doctors, from platforms to end users, from groups to individuals [105]. Its industrial chain consists of medicinal materials, medicine, medical instruments, medical institutions, medical information, and medical examination. Today, healthcare applications act as an active driving force of AI development. Colossal amounts of medical data are collected every day, and the complexity of medical information lies in its high dimensionality, nonlinearity, and sometimes strong dynamics. In this context, AI-powered diagnosis systems need to be well trained to achieve acceptable performance, which is usually assessed by a list of indicators such as prediction accuracy and the confusion matrix that characterizes classification performance.

AI plays a key role in establishing new norms for modern hospitals, moving towards one-stop medical services and new business models. In the future, the allocation of limited medical resources will still be the major issue. Where physicians require reliable intelligent preprocessing tools to reduce their workload in dealing with raw data, AI can act as a good assistant. For instance, AI helps to mark up suspicious lesion areas on optical images [106], X-ray images [107], or magnetic resonance images [108,109,110], and can analyze patients' speech to diagnose Alzheimer's disease [111]. Researchers also hope to shorten the cycle of new drug discovery by exploiting the ability of AI to learn from vast volumes of medical literature and datasets [112,113,114,115]. Furthermore, embedding AI systems in portable devices will help reduce the shortage of medical resources in remote areas and reduce social inequality. Motivated by these practical demands, there are huge driving forces behind AI-related R&D activities, the production and application of AI-embedded devices, and the construction of intelligent platforms and infrastructure. At present, AI has improved medical automation but is mainly used to aid diagnosis; using AI alone as the backbone of healthcare remains very dangerous. In 2015, IBM launched Watson Health, a business unit that uses AI to solve critical problems in healthcare; its algorithms were reported to have recommended to a cancer patient drugs with severe side effects that could even be fatal.

Another driving force behind the advancement of AI is intelligent transportation applications. Research on autonomous vehicles has been an extremely popular direction. The tasks cover all aspects from perception to decision making, all of which are considered to require intelligence [116]. Encouragingly, problems once considered challenging now have applicable approaches, such as localization, object recognition and tracking, movement prediction, and path planning [117]. Although these are ANI tasks, the high demands of dealing with the complexity and uncertainty of the real-world environment, together with the requirements of real-time onboard implementation, are boosting the AI industry. From the viewpoint of commercialization, it is hard to tell whether traditional automotive companies or Internet enterprises dedicated to intelligent solutions will dominate the innovation and the market of smart automobiles. From another perspective, vehicular cyber-physical systems are being studied to connect the information islands of individual transportation participants [118]. Models of the real-time regional traffic network can be created with the aid of cloud computing resources and based on publicly available information collected from distributed agents (such as vehicle sensors, pedestrians' smartphones, and roadside infrastructure). On this basis, intelligent algorithms are required to play a role at the macroscopic level, for instance, in predicting and optimally diverting traffic. Today, policies are being put forward to support the promotion of intelligent transportation applications and to regulate testing activities and open test roads.

In addition to the above, a wide range of simulation applications rely on and in turn drive the development of AI. Typical application scenarios range from manufacturing to education, from smart cities to entertainment, and beyond [119,120,121]. The enabling technologies include, but are not limited to, the Internet of Things, extended reality (XR), digital twins, robots, smart grids, space technologies, and so forth. In contrast with the AI winters, the twenty-first century is full of opportunities; grasping them will lead us to a prosperous future in which AI creates value and benefits human lives to the greatest extent.

4 Future

4.1 Understanding and cognition

While AI systems are created by humans, they may evolve at a much faster speed than anticipated and develop such complexity in structure, behavior, and decisions that humans can hardly follow them. In the long term, AI systems and human beings need to establish common cognition in terms of the outlook on the world, life, and values. To date, however, engineered models cannot precisely replicate real-world entities and the physical environment. That is why a bridge connecting the cyber and physical worlds is being built in the Industry 4.0 era, with the hope of feeding engineered digital replicas with online real-time measurements (for synchronization, correction, and update). It is also why the Kalman filter is regarded as a cornerstone technique in sensing, perception, and control: it achieves optimal estimation by trading off measurements (with their inevitable errors) against model outputs [122, 123].
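
The following is a minimal one-dimensional Kalman filter sketch with assumed noise variances, included only to illustrate the trade-off just described: each update blends the model prediction and a noisy measurement, weighted by their respective uncertainties.

```python
import numpy as np

q, r = 1e-4, 0.04          # process and measurement noise variances (assumed values)
x_est, p = 0.0, 1.0        # initial state estimate and its variance
true_value = 1.0           # constant quantity being tracked in this toy example

np.random.seed(0)
for _ in range(50):
    # predict: propagate the model output and grow its uncertainty
    x_pred, p_pred = x_est, p + q
    # update: blend the prediction with an error-prone measurement
    z = true_value + np.random.normal(0, np.sqrt(r))
    k = p_pred / (p_pred + r)              # Kalman gain: trust measurement vs. model
    x_est = x_pred + k * (z - x_pred)
    p = (1 - k) * p_pred

print(round(x_est, 3))      # converges near the true value of 1.0
```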

As discussed in Sect. 3.4, AI will play important roles in symbiotic systems in the future. In the narrow sense, human–machine symbiotes would bring benefits in terms of overcoming disabilities or making humans stronger (also referred to as “human augmentation”) and more knowledgeable. In an age of information explosion, it has become increasingly difficult to search through the available resources, and a cognitive digital twin (CDT) may be very helpful in collecting and pre-screening related data and then performing tasks on behalf of humans. An illustrative example, “Google Duplex,” was given in Table 1, where AI made phone calls to book a restaurant. With grounded applications of natural language processing and computer vision, CDTs may execute much more complex user demands. Customized optimization is feasible by integrating the knowledge structure of an individual; this is ongoing research, usually referred to as the personal CDT. In turn, the knowledge structure and the practical skills of humans could be improved in a targeted manner using courses suggested by AI (just as envisioned in the movie The Matrix). Similarly, in the future, scientists could better manage the explosion of knowledge with the aid of machine reading, discovering the new knowledge they need most from the hundreds of thousands of publications appearing each year.

There are plenty of AI algorithms inspired by nature, such as the grey wolf optimizer, ant colony optimization, particle swarm optimization, genetic algorithms, and so forth, and neural networks themselves are inspired by biological nervous systems. Nevertheless, these models have been greatly simplified, and as a result the mechanisms of learning and acting have become so different and divergent that AI systems can hardly understand the natural and socio-economic worlds as we do. In the future, with super-large-scale neural networks such as Wu Dao 2.0 and GPT-3 becoming common, it may be expected that the major branches of AI, such as computer vision and natural language processing, will end up unified through multimodality [124].
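
As a small illustration of the nature-inspired family mentioned above (a bare-bones sketch on an assumed toy objective, not any cited implementation), the snippet below implements particle swarm optimization: candidate solutions ("particles") move under the pull of their own best positions and the swarm's best position.

```python
import numpy as np

def objective(x):                       # toy function to minimize (assumed)
    return np.sum(x ** 2, axis=1)

rng = np.random.default_rng(0)
pos = rng.uniform(-5, 5, size=(30, 2))  # 30 particles in a 2-D search space
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), objective(pos)
gbest = pbest[pbest_val.argmin()]

for _ in range(100):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    # inertia + pull towards personal best + pull towards the swarm's best
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    val = objective(pos)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[pbest_val.argmin()]

print(gbest)                            # close to the optimum at (0, 0)
```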

On the other hand, can humans understand the behavior of AI systems and trust their decisions? As mentioned above, XAI (or interpretable AI) is the key to the extensive and generic application of AI systems in reliability-critical tasks. Theoretical guarantees are needed in terms of why a method works and what its target is [125,126,127,128].

The research on XAI can take two routes. First, AI systems can be designed as equivalent or approximate solutions to conventional non-AI solutions with clear physical meanings. In this way, it is clear what the design target is and what exactly is to be learned, and the heavy dependence on trial and error during the design phase can be reduced by learning from data. For instance, the “kernel trick” is usually used to simplify the computation of the inner product of nonlinear functions that map low-dimensional, inseparable variables to a high-dimensional, separable feature space. The problem is that no consensus has been reached on how to optimally select the type and parameters of the kernel function, and the online computational load is quite high. As a result, although their performance is satisfactory and sometimes superior, kernel-based methods face great challenges in practical real-time applications. Promisingly, deep neural network (DNN) training makes it possible to automate the manual tuning of critical parameters in these conventional solutions. Theoretical support is provided by Mercer's theorem, which shows that any positive semi-definite function is a valid kernel candidate. Therefore, DNNs can achieve a learnable and faster realization of the kernel function, where the explicit form and the coefficients of the nonlinear mapping function can be obtained [129, 130]. Nevertheless, it is worth noting that the functions learned by a DNN vary greatly across training epochs.
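
As a loose illustration of this route (our own construction under simplified assumptions, not the specific methods of [129, 130]), the sketch below trains a small network whose hidden layer serves as an explicit, reusable feature map phi(x), with a linear layer on top playing the role of the inner product in the feature space; unlike the kernel trick, the mapping is available in closed form after training.

```python
import torch
import torch.nn as nn

X = torch.randn(512, 2)
y = ((X[:, 0] ** 2 + X[:, 1] ** 2) > 1).float().unsqueeze(1)  # not linearly separable

phi = nn.Sequential(nn.Linear(2, 64), nn.Tanh())   # learned explicit feature map phi(x)
head = nn.Linear(64, 1)                            # linear separation in the feature space
opt = torch.optim.Adam(list(phi.parameters()) + list(head.parameters()), lr=0.01)

for _ in range(1000):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(head(phi(X)), y)
    loss.backward()
    opt.step()

features = phi(X)   # explicit high-dimensional representation, inspectable and reusable
print(features.shape, float(loss))
```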

Second, efforts are dedicated to studying the characteristics of AI models in order to derive human-understandable logic. This is straightforward for knowledge-based AI systems. Otherwise, XAI needs to answer questions including (i) which factors/data contribute most to the results and (ii) how to interpret the results using linguistic expressions. For question (i), just-in-time learning, locally weighted projection regression, and K-nearest-neighbor-based methods are typical examples that link decision-making directly to the related data/local models [131,132,133,134,135]. For question (ii), temporal logic-based methods are promising solutions [136]. Furthermore, for “black-box/grey-box” models, the work is to shed light on the characteristics of the sub-blocks of the deep structures [137]; ablation experiments analyze the necessity of each module and its influence on the overall performance [138]. In addition to the theoretical perspective, explainability also needs to be discussed in the context of the specific application. Some tasks require robust performance against data changes, e.g., good tolerance to disturbance, noise, missing values, and outliers, while others require high sensitivity to abnormal changes [139,140,141].
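
As a minimal, hypothetical illustration of question (i) (a simplification, not the specific methods of [131,132,133,134,135]), the sketch below explains a prediction by retrieving the training samples that support it most directly, here simply its nearest neighbours.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=5).fit(X, y)

query = X[:1]                                   # one sample whose prediction is to be explained
prediction = model.predict(query)
_, neighbour_idx = model.kneighbors(query)      # indices of the supporting training samples

print("prediction:", prediction[0])
print("supported by training samples:", neighbour_idx[0],
      "with labels", y[neighbour_idx[0]])
```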

If confusing advice is provided by an AI and unfortunately cannot be fully interpreted, shall we trust it? For many decision-making processes there is no right or wrong, but a series of strategic moves that are favorable in the long run. However, which factors are taken into consideration and how much weight is assigned to each is very subtle. For instance, for a dying patient, shall we take every chance to extend his or her life regardless of the suffering, or turn to terminal care? In addition to facts and targets, implicit and uncertain factors may play a decisive role. The point is that we may always have a question mark in our hearts when dealing with human-involved issues, because machines can hardly cognize and feel emotion as we do. Even when the decisions can be interpreted by following the second XAI research route, in such cases AI systems should only be used as supervisory tools. “All models are wrong, but some are useful.” In this sense, to be compatible with basic laws and regulations, we may trust AI behaviors, but the responsibility shall always be linked to a person.

4.2 Artificial general intelligence (AGI)

ANI systems are characterized by a limited range of well-defined intelligence-related functionalities, and their design is therefore highly specialized. In this context, advancement in ANI techniques is accompanied by awareness of the importance of correctly formulating (well-posing) the problem and properly defining the targets (“be careful what we wish for”). Nevertheless, it will not always be like this; the relationship between AI and humans is likely to change in the long term. As shown in Fig. 3, at stage I of AI, i.e., ANI (the current era), humans create AI systems and use them as tools. At stage II, i.e., Artificial General Intelligence (AGI), humans interact with AI systems and treat them as omnipotent helpers. At stage III, named Artificial Super-Intelligence (ASI), AI may be able to create better AI systems than humans can, in which sense it may be considered a novel form of life. The arrival of stage III is also regarded as a singularity in the development of AI (see Sect. 4.3).

Fig. 3 Looking to the long-term future

As Professor Stuart Russell points out, the field has operated for over 50 years on one simple assumption: the more intelligent, the better [142]. Yet unconstrained success brings huge risks as well as huge benefits. Beyond the degree of intelligence that AI systems can reach, which belongs to the technical dimension of AI development, more attention is needed on the socio-economic factors. In fact, new challenges are already here today, such as trustworthiness, privacy preservation, cyber-security, social equality, policy and law, education, and employment.

In our society, each occupation corresponds to a group of people and many more related to them. Today, the skills required of train or subway operators have changed: in the past, a considerable amount of time was devoted to training on how to operate the vehicles, whereas nowadays the focus is on emergency events in which human decisions override computer instructions. Likewise, in the aviation industry, it is common practice for pilots to take over control during take-off and landing because there are so many uncertainties that require human judgment. Not doing so may lead not only to safety risks but also to critical ethical and legal issues in case of accidents. Such risks are currently unacceptable because AI has not yet gained the trust that it can act more rationally and judiciously under all circumstances. It is also important to know on what factors a decision was based so that it can be correctly interpreted. Such issues have formed a subtopic of research called Explainable AI (XAI), whose goal is to understand how AI functions, such as what factors and logic are adopted when outputting a decision. This would contribute significantly to the establishment of a formal science (as opposed to heuristic learning) that would enable a methodology for systematic analysis and theoretically guaranteed performance. The subject has still not matured enough in the policy and law aspects to enable us to reach a justifiable decision on who should take responsibility for errors and failures related to AI systems [143].

4.3 Singularity and artificial super intelligence (ASI)

It was about 15 years ago, in 2005, that a non-fiction book titled “The Singularity Is Near” was published and soon became very popular [144]. Its author, the futurist Ray Kurzweil, defined the technological singularity as “a time when machines will have and be able to make other machines with intelligence comparable to human beings.” His prediction at the time was that this would happen by 2045, i.e., about 25 years from now. In more general terms, the singularity is the point that separates quantitative from qualitative change in the process of AI development. In the context of this paper, the singularity refers to the time when AI gains the ability to design AGI systems better than humans can. In addition to its supercomputing ability, such AI would outperform mankind in learning, prediction, creativity, productivity, decision-making, organization, management, as well as survival. If the narrow understanding of learning is limited to the algorithmic/software level (which is the case today), we should note that post-singularity AI systems may have the ability to reprogram their firmware and even to design hardware by themselves, starting from scratch. This is a domain (rebuilding our own brains) that humans can hardly reach at present, and it is uncertain whether humans can ever achieve it. Besides, post-singularity AI may be smart enough to guarantee its supply of resources while finding a way towards sustainable development. From then onwards, the evolution of AI would be dominated by AI itself, towards an unpredictable direction and at an unimaginable speed.

The singularity is a subtle borderline beyond which AI systems escape human control. It is well acknowledged that the beings with the highest level of intelligence live at the top of the food chain. Although there is no evidence that silicon-based beings can exist at all, there is also no proof that silicon cannot develop novel forms of intelligence as carbon-based creatures have. With generation-by-generation iterations, there is a chance that the capability of AI will exceed that of all humans in all aspects, which constitutes the definition of artificial super intelligence (ASI). ASI would learn from experience, adapt to new situations, handle abstract concepts, and use knowledge to manipulate the surrounding environment. It could do everything that humans can do, and do it better: logical reasoning, thinking, creativity, strategy, learning a skill on its own, and identifying knowledge resources on its own. Today, there is hardly any consensus on whether such a singularity exists and whether AGI is in the foreseeable future. Nevertheless, if there were evidence of it, what would be needed to prepare for it? We need to consider means of steering the wheel in such a way that the probability of stepping into an unexpected and unfortunate future is minimized.

5 Concluding remarks

Profound technological changes have taken place around us during the last two decades, supported by disruptive advances on both the software and the hardware sides. The dominant feature of these changes is the integration of the virtual world with the physical world through the Internet of Things (IoT). The most recent development is the radical paradigm shift from “connected things” to “connected intelligence”. Table 1 emphasizes the accelerating nature of technological development by listing the important AI milestones of the last 25 years.

Throughout the history of scientific and technological development, the emergence of any scientific and technological revolution is reflected not only in technology but also in changes to human social structure, moral constraints, laws, and education. In this article, at the critical point of the AI technological revolution, a brief history of AI development is presented, the advancement and the state of the art of the AI discipline are reviewed, and a perspective on the future of AI is offered. AI integrates with a variety of disciplines, plays an important role in history, and might shape the future of humanity. Today, AI is more accessible than ever: products and services driven by AI technologies have emerged widely around us and have had a profound impact on daily life. Predictably, AI will act as one of the backbones of new technological revolutions. Nevertheless, several troughs may once again occur if the technologies/applications cannot be put into practice swiftly according to the expectations of the dominant stakeholders (inventors rather than the general public). This article is intended to help readers better understand AI, accept AI, and think more deeply about AI. With long-lasting curiosity and wisdom, we are likely to witness the beauty of a world with a novel AI ecology. As Ophelia states in Shakespeare's Hamlet, “we know what we are, but know not what we may be.”