Orthodontic Treatment Planning based on Artificial Neural Networks

Li, Peilin; Kong, Deyu; Tang, Tian; Su, Di; Yang, Pu; Wang, Huixia; Zhao, Zhihe; Liu, Yang

doi:10.1038/s41598-018-38439-w

Download PDF

Article
Open access
Published: 14 February 2019

Orthodontic Treatment Planning based on Artificial Neural Networks

Peilin Li¹,
Deyu Kong²,
Tian Tang¹,
Di Su¹,
Pu Yang¹,
Huixia Wang¹,
Zhihe Zhao¹ &
…
Yang Liu²

Scientific Reports volume 9, Article number: 2037 (2019) Cite this article

10k Accesses
71 Citations
16 Altmetric
Metrics details

Subjects

Abstract

In this study, multilayer perceptron artificial neural networks are used to predict orthodontic treatment plans, including the determination of extraction-nonextraction, extraction patterns, and anchorage patterns. The neural network can output the feasibilities of several applicable treatment plans, offering orthodontists flexibility in making decisions. The neural network models show an accuracy of 94.0% for extraction-nonextraction prediction, with an area under the curve (AUC) of 0.982, a sensitivity of 94.6%, and a specificity of 93.8%. The accuracies of the extraction patterns and anchorage patterns are 84.2% and 92.8%, respectively. The most important features for prediction of the neural networks are “crowding, upper arch” “ANB” and “curve of Spee”. For handling discrete input features with missing data, the average value method has a better complement performance than the k-nearest neighbors (k-NN) method; for handling continuous features with missing data, k-NN performs better than the other methods most of the time. These results indicate that the proposed method based on artificial neural networks can provide good guidance for orthodontic treatment planning for less-experienced orthodontists.

Different bimodal neuromodulation settings reduce tinnitus symptoms in a large randomized trial

Article Open access 30 June 2022

Efficient and accurate identification of ear diseases using an ensemble deep learning model

Article Open access 25 May 2021

Speaking without vocal folds using a machine-learning-assisted wearable sensing-actuation system

Article Open access 12 March 2024

Introduction

Malocclusion is a common disease that impairs occlusal function, increases the incidence of caries, causes psychological discomfort, endangers health and reduces the quality of life^1,2,3. An epidemiologic survey in America showed that 57% to 59% of each racial group has at least some degree of orthodontic treatment need⁴. The Health Policy Institute of the American Dental Association reported that 33% of young adults avoid smiling due to the condition of their mouth and teeth, and 82% of adults believe that the good appearance of the mouth and teeth can help them advance in life⁵. To achieve satisfactory orthodontic treatment effects, treatment planning must be carefully performed before the treatment process begins⁶. Comprehensive and deliberate evaluation of many factors makes treatment planning a complex process without any objective patterns, and heavily depends on the subjective judgment of the orthodontists.

Researchers have attempted to make orthodontic treatment planning procedures more objective by using some prediction methods. Rule-based expert systems (RBESs) were used to help orthodontic students and inexperienced practitioners with problem-solving and decision-making⁷. RBESs use formulated rules to construct a decision tree but suffer from considerable knowledge lost in the rule determination. To overcome the limitations of RBESs, case-based expert systems (CBESs) have been developed. CBESs acquire new knowledge by analyzing and taking in new cases, thereby acquiring better indexing features⁷. The difficulty of using CBES lies in finding an exact case that matches the new case; thus, some new cases have to be properly modified to be identified. A software that combined RBES and CBES was proposed in Noroozi’s work⁸, and the application of fuzzy logic made it more practical. Takada⁹ and Yagi¹⁰ proposed a CBES that used a k-nearest neighbors (k-NN) algorithm to perform classification in tooth-extraction decisions. However, the k-NN algorithm is a type of instance-based learning that is sensitive to the local structure of data and requires an increasing number of calculations as the number of cases increases. The artificial neural network (ANN) has the advantage of excavating features from massive medical data^11,12, and the past decade has witnessed the rapid development of this approach. It has also been applied to determine necessary tooth extraction¹³ and extraction patterns (specific teeth to be removed)¹⁴ in orthodontic treatment planning.

It is important to note that different orthodontists can have markedly different plans for a specific case¹⁵. Considerable variety can occur particularly in the decision of which teeth to extract¹⁶. In addition to outputting a recommended treatment plan, an ANN that can output the feasibilities of multiple extraction options will allow orthodontists greater flexibility. Additionally, anchorage (resistance to unwanted tooth movement) is another important factor that should be considered when making plans⁶. In extraction cases using maximum anchorage, the early use of appropriate means to reinforce anchorage must be taken into account in the beginning. Related data are not always available in the actual application environment of ANNs^17,18,19. ANN models cannot predict missing data; and thus a case with incomplete data may not be predicted by the models^20,21,22. Providing methods that can handle missing data may make the model more applicable. Statistical approaches, such as imputation with average value or normal value, and k-NN imputation method have been intensively studied^22,23. However, the comparison of different methods have showed different results and there is not a unique solution that can obtain best results in each neural network^18,21,24. In this study, the traditional statistical approaches (average value method, frequent value method, specific value method and median value method) and the k-NN method are used to evaluate which method may be the best for increasing the accuracy.

The ANN in this study can promptly output both a recommended plan and feasible probabilities of several alternative options. The plan covers the most crucial concerns of orthodontists for most cases, including the extraction-nonextraction decision, extraction patterns and anchorage patterns (whether to use maximum anchorage to retract anterior teeth). We further calculated the relative contribution of features in each network model with the partial derivatives (PaD) method²⁵ and, for the first time, investigated the effect of several complementary methods on handling missing data in orthodontic treatment planning.

Results

Orthodontic treatment planning with ANNs

Three neural networks are trained with 302 cases from the Department of Orthodontics, West China Hospital of Stomatology. As shown in Fig. 1a, the first neural network determines whether a patient needs tooth extraction. If the patient needs extraction, the second and third neural networks then predict the specific extraction pattern and anchorage pattern, respectively. The network to determine extraction patterns, for example, is a three-layer fully connected multilayer perceptron (MLP), as shown in Fig. 1b, which consists of 24 input nodes, 10 hidden nodes, and 4 output nodes. The other two neural networks share the same model structure as this network but have different numbers of output nodes. The extraction-nonextraction neural network has 2 output nodes, and the anchorage patterns neural network has 3. The trained neural network models are provided in the Supplementary Information, together with the demonstration of the treatment planning process.

The accuracies of the ANNs

As illustrated in Fig. 2, the receiver operating characteristic (ROC) curve shows the performance of the ANN on the extraction decision. The model yields an area under the curve (AUC) of 0.982 (95% CI 0.968–0.995). The closer the point on the ROC curve is to the upper left corner, the higher the accuracy of the model, and the point closest to the upper left corner is the best cutoff value with the least error. The optimum diagnostic cutoff value of this model is 0.692, with which the model reaches a sensitivity of 94.6% (95% CI 0.894–0.964) and a specificity of 93.8% (95% CI 0.870–0.984). If the prediction probability of a case for extraction is greater than 0.692, it will be diagnosed as an extraction case and passed to the other two models to determine the extraction pattern and anchorage pattern. Figure 3 shows the predictive accuracies of the ANNs. The accuracy of the extraction-nonextraction decision-making is 94.0%, and the accuracies of the learning set, validation set, and test set are 94.0%, 95.0%, and 93.3%, respectively. The predictive accuracy of the extraction patterns is 83.3%, and the accuracies of the learning set, validation set, and test set are 83.6%, 84.1%, and 81.8%, respectively. The overall accuracy of the anchorage patterns is 92.8%, and the accuracies of the learning set, validation set, and test set are 93.3%, 90.9%, and 93.2%, respectively. The decision-making of the extraction patterns is the most complicated part of the treatment planning, and different doctors may use different extraction patterns¹⁶. This explains why its prediction accuracy is lower than the other two parts of the treatment planning, and this gap is comparable to the results of another study¹⁴. Considering the subjectivity of decision-making on extraction patterns, the model offers several practicable alternatives for doctors to choose, which makes it more applicable.

Relative contribution of features for planning decisions

In clinical practice, doctors may not always have access to all the required data used by ANNs. Therefore, investigating the contribution of all features for each decision part will be of practical importance. We used the PaD method²⁵ to calculate the features’ relative contributions and ranked them in order. The results are illustrated in Table 1. With respect to extraction decision-making, features “crowding, upper arch” “crowding, lower arch” and “U1-NA°” are the three that contributed most. “ANB” “overbite” and “lip incompetence” are the three features that are most related to the prediction of extraction patterns. The three most important features for anchorage pattern determination by the model are the “curve of Spee”, “nasolabial angle” and “UL-EP”. The results demonstrate that these features are selected or “thought” as important by the models when making decisions. When the models make different predictions, they treat different features as the most important.

Table 1 Rank of the relative contribution of every feature.

Full size table

Comparison of different complement methods

In the present study, we took a step further to investigate when the data of the most important features are missing, whether the impact of data loss on accuracies can be reduced or minimized by complement methods^21,23,24,26. The complement effects are displayed in Table 2. The results show that the average value method has better performance than k-NN when dealing with discrete features regarding “lip incompetence” and “nasolabial angle”. The accuracies of the four traditional methods (average value, frequent value, specified value and median value methods) are consistent for processing the “nasolabial angle”. When dealing with continuous variables, k-NN performs better than the other methods, except for “crowding, lower arch”. With respect to the three features that the models “consider” to make the largest contribution to the treatment decisions, k-NN has the best complement performance. The effect of 2-k-NN is similar to that of 3-k-NN, in general.

Table 2 Complement effects of different methods on important features with missing data.

Full size table

Discussion

In this study, we propose an MLP-based classifier to analyze patients’ medical records and output both a recommended treatment plan and the feasibilities of various treatment plans regarding the aspects of extraction vs. nonextraction, extraction patterns and anchorage patterns. The feature importance is calculated and ranked, and we compared the effects of different complement methods. Since there can be several feasible treatment plans regarding a certain case, the output probabilities of extraction and anchorage patterns offer users considerable flexibility as well as guidance. The users can review the recommended plan, compare various treatment options, take other aspects into account, and finally, develop a standardized, accurate, and effective treatment plan. Figure 4 illustrates the clinical application process of the ANNs. An example patient was used for demonstration. Informed consent to publish the information and images in an online open-access publication was obtained, and the example patient’s medical data (patient A) can be found in the Supplementary Note 2.

Our neural network models have better performance compared with previous prediction methods. The predictive accuracy of extraction reached 94.0%, higher than the other prediction models^13,27, and made a one percent improvement on the 93% accuracy of Jung’s study¹⁴. The AUC of 0.9815 is also higher than the result of 0.904 in a previous study based on the CBES and the k-NN algorithm⁹. For the most complex part of extraction pattern prediction, the model reached 84.2% accuracy, similar to the result of Jung’s model¹⁴, which is also based on a neural network. In addition, this study is the first to make a prediction of the use of maximum anchorage by an ANN, to the best of our knowledge. The accuracy was 92.8%, suggesting the potential of ANN in assisting orthodontists in making more detailed treatment plans.

We used the PaD method to study the contribution of features to the outcome and ranked them in order. Xie¹³ used the “weights” method in a previous study and investigated the connection strengths of each neuron in the input layer with each neuron in the hidden layer to represent the contribution of every input index. They used different input indexes and different methods to calculate the contribution, and they found that “anterior teeth uncovered by incompetent lips” and “IMPA” were the two indexes that presented the largest contributions to the extraction decision vs. nonextraction. The “weights” method allows for a good classification of the input features but lacks stability²⁵. The PaD method presents more complete results and makes a more severe discrimination between minor and major contributing variables in comparison to the “weights”^25,28.

It is also the first time to investigate the effect of complement methods in similar studies. Our results suggested that the average value method outperformed k-NN when dealing with discrete variables, but k-NN performed better for the continuous variable. Meanwhile, k-NN is also a better choice when the data of the most important feature for each output decision is missing. This research contributes to the handling of missing data and can make the model more applicable.

It takes a relatively long time for orthodontists to accumulate experience. Doctors with less experience often require consultation with experts. Since medical developments are uneven and severely affected by economic conditions, expert consultation is especially deficient in areas with poor medical conditions. The proposed ANN system can not only assist less-experienced orthodontists and students in learning but also help patients obtain a clear understanding of their treatment plans.

Methods

Cases Collection

A total of 302 patients who received orthodontic treatment at the Department of Orthodontics, West China Hospital of Stomatology in Chengdu, China, from 2014 to 2018 were included in this study. The inclusion criteria were fixed labial appliance patients with full permanent dentition (except for second or third molars) without functional appliance treatment or orthognathic surgery. Their medical records before orthodontic treatment were collected, including demographic information, extraoral photos, intraoral photos, pretreatment dental casts and lateral cephalometric measurements^6,29. Twenty-four commonly used feature variables were extracted from these clinical records as input features. The input features were preprocessed to ensure that all of them were quantified before being used for model training. Nonquantitative data were converted into numerical values by the encoding method. Supplementary Table S3 shows the detailed features used in the ANNs and how the nonquantitative data were encoded. All treatment planning was carefully performed by Dr. Zhao and Dr. Tang, who are both orthodontic specialists and have 26 and 12 years of clinical work experience, respectively. This study was approved by the West China Hospital of Stomatology Institutional Review Board (WCHSIRB-D-2018-094). Informed consent was obtained from all participants or their legal guardians. Informed consent for publication of the medical records of four example patients in an online open-access publication was also obtained. All experiments were performed in accordance with relevant guidelines and regulations.

The composition of the cases and datasets

Among the total population, 222 persons were extraction cases, accounting for 73.5%, and the other 80 persons were nonextraction cases, accounting for 26.5%. The tooth extraction patterns were divided into four types: maxillary and mandibular first premolar extraction (4444), maxillary first premolar and mandibular second premolar extraction (4455), maxillary and mandibular second premolar extraction (5555) and other extraction patterns including only maxillary first premolar extraction, maxillary second premolar and mandibular first premolar extraction. These four patterns comprise 41.9%, 19.8%, 18.5% and 18.5% of the extraction cases, respectively. The anchorage patterns included three types, i.e., maxillary maximum anchorage (1100), maxillary and mandibular maximum anchorage (1111), and no use of maximum anchorage (0000), accounting for 29.7%, 21.6% and 48.6% of the extraction cases, respectively. Descriptions of the extraction patterns and anchorage patterns are shown in Table 3.

Table 3 Descriptions of the extraction patterns and anchorage patterns.

Full size table

The dataset is split into a training set, a validation set and a test set. The neural networks do not have access to the test set during the training process until the final evaluation of the accuracy. The reserve part of the dataset is split into a training set and a validation set with a ratio of 3/1, which is optimized according to the learning curve³⁰. The training set is used to update the weights of the network. The validation set is used to avoid overfitting¹³. Considering that we had a smaller dataset, we used a greater percentage of data to test the models. Therefore, the training set, validation set and test set were set with a typical 60/20/20 split to maintain a balance between the sets. Cases with different tags were randomly distributed to the three datasets in each simulation so that the proportions of various cases are similar among the three sets, reducing the additional bias introduced by the data partitioning process. There are 222 extraction cases; thus, the 222 cases are used in the neural network models for predicting extraction patterns and anchorage patterns. The number and percentage of different kinds of treatment plans in each set are shown in Table 4.

Table 4 Number and percentage of different kinds of treatment plans in each set.

Full size table

Network models

All three neural networks used in this work are three-layer MLPs. Each MLP consists of three full connection layers. The MLP used to determine extraction patterns is illustrated in Fig. 1b. The activation function of the hidden layer is tanh. A softmax layer of 4 outputs is applied at the end of the model^31,32. The cross-entropy^33,34 CE_tanh is given by

$$CE=-\,t\ast \mathrm{log}(y)-(1-t)\ast \mathrm{log}(1-y)$$

(1)

where t is the target value and y is the output of the MLP. Equation (1) returns a numerical value approaching infinity, which heavily penalizes output when y approaches −1 or 1. CE_tanh approaches its minimum value when y approaches t. The weight and bias values are updated according to the scaled conjugate gradient method³⁵. Although minimizing CE_tanh leads to a good accuracy of classification, considerably minimizing CE_tanh may cause overfitting. The dropout method is used to prevent overfitting^36,37. The detailed training setting including learning rate, number of epochs, batch size, et al., are provided in Supplementary Note 4.

For the extraction prediction, the model outputs a probability of extraction. We define a determination of extraction treatment for each case as the probability of extraction being higher than a cutoff value. The algorithm computes sensitivity and specificity by testing a variety of cutoff values. Varying the cutoff point in the interval 0–1 generates a conventional ROC curve. Youden’s index³⁸ is applied to obtain the optimum cutoff. If the probability is higher than the optimum cutoff, the case will be passed to the prediction of extraction patterns and anchorage patterns.

Relative contribution calculation of features and complement of the missing data

The PaD method, which is supposed to be the most useful method in giving the relative contribution and the contribution profile of the input factors, was used to evaluate the relative contribution calculation of the input features. The PaD method computes the partial derivatives of the ANN’s output with respect to the input to obtain the profile of the variations of the output for small changes of one input variable. For a network with n_i inputs (where i represents the feature index and i = 1, 2, …, 24 in this work), one hidden tanh layer with n_h neurons, and no outputs, the partial derivatives of the output y_j with respect to input x_j (where j represents the case index and j = 1, 2, …, 302 in this work) are:

$${d}_{ij}={S}_{j}\sum _{h=1}^{{n}_{h}}{w}_{ho}(1-{I}_{hj}^{2}){w}_{ih}$$

(2)

where S_j is the derivative of the output neuron with respect to the input, which is the weights between the output neuron and h_th hidden neuron, I_hj is the output of the h_th hidden neuron, and w_ih is the weights between the i_th input neuron and the h_th hidden neuron.

Then, the relative contribution of the ANN’s output to the dataset with respect to the i_th input feature can be calculated by a sum of the square partial derivatives as:

$$SS{D}_{i}=\sum _{j=1}^{N}{d}_{ij}^{2}$$

(3)

where N is the data size and equals 302 in this work. The SSD values enable direct access to the influence of each input variable on the output.

We use the average value method, frequent value method, specific value method, median value method, and k-NN method to study their complement effects. The four traditional methods complement the missing data with the average value, the frequent value, the specified value (standard value of normal population), and the median value. The k-NN method is a method to look for the new case’s nearest neighbors from the complete cases and use an estimated value to replace the missing data^26,39. This value is the weighted average of the values of its k nearest neighbors. We used 2-k-NN (2 nearest neighbors) and 3-k-NN (3 nearest neighbors) in this study. Each neighbor is given a weight of 1/d, where d is the distance to the neighbor. The neighbors are taken from the dataset for which the object property value is known.

Data Availability

The present neural network models, medical data of example patients, features descriptions and training setting are provided in the Supplementary Information.

References

Krooks, L., Pirttiniemi, P., Kanavakis, G. & Lahdesmaki, R. Prevalence of malocclusion traits and orthodontic treatment in a Finnish adult population. Acta odontologica Scandinavica 74, 362–367, https://doi.org/10.3109/00016357.2016.1151547 (2016).
Article CAS PubMed Google Scholar
Feldens, C. A. et al. Impact of malocclusion and dentofacial anomalies on the prevalence and severity of dental caries among adolescents. The Angle orthodontist, https://doi.org/10.2319/100914.1 (2015).
Mary, A. V. et al. Assessing Quality of Life using the Oral Health Impact Profile (OHIP-14) in Subjects with and without Orthodontic Treatment need in Chennai, Tamil Nadu, India. Journal of clinical and diagnostic research: JCDR 11, Zc78–zc81, https://doi.org/10.7860/jcdr/2017/27309.10442 (2017).
Article PubMed Google Scholar
Proffit, W. R., Fields, H. W. Jr. & Moray, L. J. Prevalence of malocclusion and orthodontic treatment need in the United States: estimates from the NHANES III survey. The International journal of adult orthodontics and orthognathic surgery 13, 97–106 (1998).
CAS PubMed Google Scholar
Association, A. D. Oral Health and Well-Being in the United States, https://www.ada.org/en/science-research/health-policy-institute/oral-health-and-well-being (2015).
Proffit, W. R., Fields, H. W. Jr. & Sarver, D. M. Contemporary orthodontics. 5th edn, (Mosby, 2013).
Hammond, R. & Freer, T. J. Application of a case-based expert system to orthodontic diagnosis and treatment planning: a review of the literature. Australian orthodontic journal 14, 150 (1996).
CAS PubMed Google Scholar
Noroozi, H. Orthodontic treatment planning software. American journal of orthodontics and dentofacial orthopedics 129, 834–837 (2006).
Article Google Scholar
Takada, K., Yagi, M. & Horiguchi, E. Computational formulation of orthodontic tooth-extraction decisions. Part I: to extract or not to extract. The Angle orthodontist 79, 885–891 (2009).
Article Google Scholar
Yagi, M., Ohno, H. & Takada, K. Computational formulation of orthodontic tooth-extraction decisions. Part II: which tooth should be extracted? The Angle orthodontist 79, 892–898 (2009).
Article Google Scholar
Worden, K., Staszewski, W. J. & Hensman, J. J. Natural computing for mechanical systems research: A tutorial overview. Mechanical Systems and Signal Processing 25, 4–111 (2011).
Article ADS Google Scholar
Jia, F., Lei, Y., Lin, J., Zhou, X. & Lu, N. Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mechanical Systems and Signal Processing 72, 303–315 (2016).
Article ADS Google Scholar
Xie, X., Wang, L. & Wang, A. Artificial neural network modeling for deciding if extractions are necessary prior to orthodontic treatment. The Angle orthodontist 80, 262–266, https://doi.org/10.2319/111608-588.1 (2010).
Article PubMed Google Scholar
Jung, S. K. & Kim, T. W. New approach for the diagnosis of extractions with neural network machine learning. American journal of orthodontics and dentofacial orthopedics: official publication of the American Association of Orthodontists, its constituent societies, and the American Board of Orthodontics 149, 127–133, https://doi.org/10.1016/j.ajodo.2015.07.030 (2016).
Article Google Scholar
Lee, R., Macfarlane, T. & O’Brien, K. Consistency of orthodontic treatment planning decisions. Clinical Orthodontics & Research 2, 79–84 (1999).
Article CAS Google Scholar
Luke, L. S., Atchison, K. A. & White, S. C. Consistency of patient classification in orthodontic diagnosis and treatment planning. Angle Orthodontist 68, 513 (1998).
CAS PubMed Google Scholar
Sterne, J. A. et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. British medical journal 339, 157–160 (2009).
Google Scholar
Jerez, J. M. et al. Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artificial intelligence in medicine 50, 105–115 (2010).
Article Google Scholar
García-Laencina, P. J., Abreu, P. H., Abreu, M. H. & Afonoso, N. Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values. Computers in biology and medicine 59, 125–133 (2015).
Article Google Scholar
Aydilek, I. B. & Arslan, A. A novel hybrid approach to estimating missing values in databases using k-nearest neighbors and neural networks. International Journal of Innovative Computing, Information and Control 7, 4705–4717 (2012).
Google Scholar
Jayalakshmi, T. & Santhakumaran, A. In Data Storage and Data Engineering (DSDE), 2010 International Conference on. 159–163 (IEEE, 2010).
García-Laencina, P. J., Sancho-Gómez, J.-L. & Figueiras-Vidal, A. R. Pattern classification with missing data: a review. Neural Computing and Applications 19, 263–282 (2010).
Article Google Scholar
Ennett, C. M., Frize, M. & Walker, C. R. Influence of missing values on artificial neural network performance. Stud Health Technol Inform 84, 449–453 (2001).
CAS PubMed Google Scholar
Chen, H., Grant-Muller, S., Mussone, L. & Montgomery, F. A study of hybrid neural network approaches and the effects of missing data on traffic forecasting. Neural Computing & Applications 10, 277–286 (2001).
Article Google Scholar
Gevrey, M., Dimopoulos, I. & Lek, S. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological modelling 160, 249–264 (2003).
Article Google Scholar
Batista, G. E. & Monard, M. C. A Study of K-Nearest Neighbour as an Imputation Method. HIS 87, 251–260 (2002).
Google Scholar
Martina, R., Teti, R., D’Addona, D. & Iodice, G. Neural network based system for decision making support in orthodontic extractions. Intelligent Production Machines and Systems, 235–240 (2006).
Dedecker, A. P. et al. Application of artificial neural network models to analyse the relationships between Gammarus pulex L. (Crustacea, Amphipoda) and river characteristics. Environmental monitoring and assessment 111, 223–241, https://doi.org/10.1007/s10661-005-8221-6 (2005).
Article CAS PubMed Google Scholar
Graber, L. W., Vanarsdall, R. L., Vig, K. W. & Huang, G. J. Orthodontics: Current Principles and Techniques. (Elsevier Health Sciences, 2016).
Guyon, I. A scaling law for the validation-set training-set size ratio. AT&T Bell Laboratories, 1–11 (1997).
Mavani, V., Raman, S. & Miyapuram, K. P. In Computer Vision Workshop (ICCVW), 2017 IEEE International Conference on. 2783–2788 (IEEE).
Yang, J. B., Nguyen, M. N., San, P. P., Li, X. L. & Krishnaswamy, S. In 2015 International Conference onArtificial Intelligence (Ijcai). 3995–4001.
Kshirsagar, A., Seftel, A., Ross, L., Mohamed, M. & Niederberger, C. Predicting hypogonadism in men based upon age, presence of erectile dysfunction, and depression. International journal of impotence research 18, 47 (2006).
Article CAS Google Scholar
Boixo, S. et al. Characterizing quantum supremacy in near-term devices. Nature Physics 14, 595 (2018).
Article ADS CAS Google Scholar
Møller, M. F. A scaled conjugate gradient algorithm for fast supervised learning. Neural networks 6, 525–533 (1993).
Article Google Scholar
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1929–1958 (2014).
MathSciNet MATH Google Scholar
Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012).
Youden, W. J. Index for rating diagnostic tests. Cancer 3, 32–35 (1950).
Article CAS Google Scholar
Altman, N. S. An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician 46, 175–185 (1992).
MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases & Department of Orthodontics, West China Hospital of Stomatology, Sichuan University, Chengdu, 610041, P.R. China
Peilin Li, Tian Tang, Di Su, Pu Yang, Huixia Wang & Zhihe Zhao
State Key Laboratory of Electronic Thin Films and Integrated Devices, University of Electronic Science and Technology of China, Chengdu, 610054, P. R. China
Deyu Kong & Yang Liu

Authors

Peilin Li
View author publications
You can also search for this author in PubMed Google Scholar
Deyu Kong
View author publications
You can also search for this author in PubMed Google Scholar
Tian Tang
View author publications
You can also search for this author in PubMed Google Scholar
Di Su
View author publications
You can also search for this author in PubMed Google Scholar
Pu Yang
View author publications
You can also search for this author in PubMed Google Scholar
Huixia Wang
View author publications
You can also search for this author in PubMed Google Scholar
Zhihe Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Yang Liu
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

P.L., D.K., Z.Z. and Y.L. conceived and designed the experiments. Z.Z. and T.T. made the orthodontic treatment plans of the patients. P.L., D.S., P.Y. and H.W. collected the medical records. P.L., D.K. analyzed the data and wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Zhihe Zhao or Yang Liu.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Li, P., Kong, D., Tang, T. et al. Orthodontic Treatment Planning based on Artificial Neural Networks. Sci Rep 9, 2037 (2019). https://doi.org/10.1038/s41598-018-38439-w

Download citation

Received: 21 August 2018
Accepted: 21 December 2018
Published: 14 February 2019
DOI: https://doi.org/10.1038/s41598-018-38439-w

This article is cited by

RGB oralscan video-based orthodontic treatment monitoring
- Yan Tian
- Hanshi Fu
- Ruili Wang
Science China Information Sciences (2024)
Facial profile evaluation and prediction of skeletal class II patients during camouflage extraction treatment: a pilot study
- Runzhi Guo
- Yuan Tian
- Yannan Sun
Head & Face Medicine (2023)
Evaluation of artificial intelligence model for crowding categorization and extraction diagnosis using intraoral photographs
- Jiho Ryu
- Ye-Hyun Kim
- Seok-Ki Jung
Scientific Reports (2023)
The prediction of sagittal chin point relapse following two-jaw surgery using machine learning
- Young Ho Kim
- Inhwan Kim
- Hwa Sung Chae
Scientific Reports (2023)
A novel machine learning model for class III surgery decision
- Hunter Lee
- Sunna Ahmad
- Hakan Turkkahraman
Journal of Orofacial Orthopedics / Fortschritte der Kieferorthopädie (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Orthodontic treatment planning with ANNs

The accuracies of the ANNs

Relative contribution of features for planning decisions

Comparison of different complement methods

Discussion

Methods

Cases Collection

The composition of the cases and datasets

Network models

Relative contribution calculation of features and complement of the missing data

Data Availability

References

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing Interests

Additional information

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links