Hands-on Tutorial: State-of-the-Art AI Implementation in Manufacturing — Part 1

Here is the first part of the guide on implementing ML and AI solutions in a manufacturing company. The guide applies to all types of factories and products that generate structured data, preferably stored in databases. It works best for high-volume products that get tested and have measurable output properties, e.g. resistance, latency, frequency, torque, power, energy consumption, pressure, speed, vibration, strength, clearance, efficiency, timing, thrust, and any other numeric or categorical property that can be measured and matters for the final characteristics and production yield.

As for a name for such a system, it could be called Automated Production Optimisation, or simply a Digital Twin of the process and/or product.

As the examples above suggest, it can be used for things such as PCBs, jet and rocket engines, gearboxes, combustion engines, and all other mechanical, electronic, pneumatic and hydraulic devices.

Let’s pretend it’s 2050 and you are an AI expert in a Super Giga Factory. Imagine your task is to produce 100,000 rocket engines a month, each assembled from 100 key components, each described by 50 key dimensional features. Every feature tolerance is very tight, ranging from 0.0005 mm to 0.01 mm. Additionally, you have multiple machines manufacturing the same components, each producing slightly different results. Your task is to pass the final product test, which in this case means achieving 1.7 million pounds of thrust with 0.05-pound precision. Currently, without AI support, the final pass rate on that test is only 40%. 60% of engines fail even though all components are within their drawing limits; some combinations simply don’t work and you don’t know why. Manual analysis and decision making is impossible because the dimensionality overwhelms you. Here is a simple chart that we will get to once we go through all parts of this guide.

In terms of software requirements, most of the modules are free. There is always a choice of premium tools (e.g. the Anaconda package manager), but the project above can be built entirely from quality open-source Python libraries. Some people react with scepticism when they hear about open source in larger businesses, but in the data science and machine learning world this is bread and butter: these are the most developed tools, actively maintained by thousands of contributors on services such as GitHub. For example Keras (a deep learning library) is used in places like NASA and CERN (LHC), as it is highly efficient for designing neural networks. Here is a basic list of those libraries:

  1. NumPy
  2. SciPy
  3. Pandas
  4. Matplotlib
  5. Seaborn
  6. scikit-learn
  7. DVC
  8. TensorFlow
  9. Keras
  10. itertools
  11. Conda

These are the steps we need to take to fully automate any manufacturing process as a closed feedback loop, assuming component traceability and a functional test of the assembled product/sub-assembly are already in place:

1. Goal-Setting and Application Understanding

Decide what the ultimate goal of that closed loop should be. For example, it can be to:

  • maximize the pass rate on the final functional test
  • minimize part-to-part variation (the spread of the output population)
  • steer the population of one or more output parameters up or down within tolerance limits
  • find the optimal settings for a certain configuration of output results

In terms of application understanding, you need a list of all important inputs and outputs that are critical for the process, based on your current understanding.

2. Data Selection and Integration

Based on the goals, inputs and outputs, the collected data needs to be selected and organized into a meaningful set based on availability, accessibility, importance and quality. Think about which parameters can always be delivered for every single component. Consider data accuracy and validation: you don't want incorrect and noisy data.
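To make this concrete, here is a minimal sketch of pulling and joining traceability data with pandas and SQLAlchemy. The connection string, table and column names (component_measurements, functional_tests, engine_serial, thrust_lbf, passed) are hypothetical placeholders for your own schema.

```python
# Minimal sketch: join component measurements to functional test results by serial number.
# All table/column names and the connection string are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@db-host/production")  # assumed connection

# Dimensional measurements per component, linked to the assembled engine serial
components = pd.read_sql("SELECT * FROM component_measurements", engine)

# Functional test results per engine serial
tests = pd.read_sql("SELECT engine_serial, thrust_lbf, passed FROM functional_tests", engine)

# One row per engine: pivot component features wide, then join the test outcome
features = components.pivot_table(index="engine_serial",
                                  columns="feature_name",
                                  values="measured_value")
dataset = features.join(tests.set_index("engine_serial"), how="inner")
```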

3. Data Cleaning and Preprocessing

Search for missing data and remove noisy, redundant and low-quality data from the data set to improve its reliability and effectiveness. In this step you can also include standardisation algorithms, as sketched below.
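A minimal cleaning and standardisation sketch with pandas and scikit-learn, continuing from the hypothetical `dataset` built in the previous step; the 95% completeness threshold and median imputation are just one possible strategy.

```python
# Clean the joined dataset and standardise numeric inputs (illustrative choices).
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Drop duplicate rows and columns with too many missing values
dataset = dataset.drop_duplicates()
dataset = dataset.dropna(axis=1, thresh=int(0.95 * len(dataset)))  # keep columns >= 95% complete

# Fill the remaining gaps with per-column medians (one possible strategy)
dataset = dataset.fillna(dataset.median(numeric_only=True))

# Standardise numeric inputs to zero mean / unit variance
input_cols = [c for c in dataset.columns if c not in ("thrust_lbf", "passed")]
scaler = StandardScaler()
X = scaler.fit_transform(dataset[input_cols])
y = dataset["thrust_lbf"].values
```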

4. Data Transformation

Depending on the complexity and accuracy of the processes, this stage can range from basic to major transformations. If you think the relationship between input and output parameters is not linear, you can generate polynomial and interaction features (sklearn.preprocessing.PolynomialFeatures). That will increase your input size (the number of numeric columns), so to squeeze it back down you can apply PCA (sklearn.decomposition.PCA) afterwards. In this step you can also focus on engineering new features, which may involve many techniques, for example (a short sketch follows the list below):

  • averages of certain columns
  • rolling averages of one or more columns with different period/row-count windows
  • subtractions, additions, etc.

Feature transformation and generation is a very broad field and can improve your final accuracy when done correctly.
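Here is a minimal sketch of the transformations above, continuing from the cleaned `dataset` and scaled `X`; the polynomial degree, the 95% variance threshold and the engineered column names (chamber_dim_1, bore_diameter, etc.) are illustrative assumptions.

```python
# Sketch: hand-engineered features, polynomial/interaction terms, then PCA compression.
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
from sklearn.decomposition import PCA

# Hand-engineered features on the raw table (hypothetical column names)
dataset["chamber_mean"] = dataset[["chamber_dim_1", "chamber_dim_2"]].mean(axis=1)    # average of related columns
dataset["bore_minus_piston"] = dataset["bore_diameter"] - dataset["piston_diameter"]  # subtraction feature
dataset["chamber_mean_roll"] = dataset["chamber_mean"].rolling(window=20).mean()      # rolling average over 20 builds

# Polynomial and interaction terms for non-linear input/output relationships
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)  # X = scaled inputs from the cleaning step

# PCA to squeeze the expanded feature space back down
pca = PCA(n_components=0.95)    # keep enough components to explain 95% of the variance
X_reduced = pca.fit_transform(X_poly)
```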

As the last thing in this chapter we will focus on Data Version Control: a relatively new field that deals with controlling changes in your data sets, both training and production sets. We will use the DVC module for that.

5. Neural network design

Using Keras you will design a neural network model that learns from historical structured data. It takes all the manufacturing data you defined as inputs upstream of your functional test, with the functional test results as the output. With a trained network you will be able to predict functional test results for every product that comes through the line, and based on that estimate the chances of a part passing your test, with an accuracy that depends on your training and data quality. With the same model you can also simulate any possible configuration of components to see its predicted performance.
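A minimal Keras regression sketch on the tabular data from the previous steps; the layer sizes, train/validation split, epochs and batch size are just starting assumptions, not tuned values.

```python
# Sketch: small dense network that maps preprocessed inputs to the functional test result.
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(X_reduced, y, test_size=0.2, random_state=42)

model = keras.Sequential([
    layers.Input(shape=(X_train.shape[1],)),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),  # predicted functional test result, e.g. thrust
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=100, batch_size=256,
    callbacks=[keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)],
)
```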

6. Saving Model

In this step we will define how to save your trained model structure, weights and all other utilities: standardisation, polynomial, PCA and similar objects. Yes, those things have to be saved too, and their versions have to be controlled if you want to avoid future issues.
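One way to persist the model and every preprocessing object it depends on; joblib and the file names are my choice here, not a requirement.

```python
# Sketch: save the trained Keras model plus the fitted preprocessing objects.
import joblib

model.save("thrust_model_v1.keras")       # Keras model architecture + weights (.h5 for older Keras versions)
joblib.dump(scaler, "scaler_v1.joblib")   # StandardScaler fitted on training data
joblib.dump(poly, "poly_v1.joblib")       # PolynomialFeatures object
joblib.dump(pca, "pca_v1.joblib")         # PCA object

# Loading them back for inference
from tensorflow import keras
model = keras.models.load_model("thrust_model_v1.keras")
scaler = joblib.load("scaler_v1.joblib")
```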

7. Start generating and saving live predictions

The next step is to actually predict the expected results for the components that are currently in production and will reach your line soon. We want to see which parameters have the potential to be improved and corrected. With a database query we pull data for all components manufactured e.g. in the last day. Here we can either simulate only the products that are actually assembled (with their known components) or generate a list of all possible configurations of a given finite number of components. We have to be careful because of the curse of dimensionality: given 5 component types with 100 units in each group, the number of possible combinations goes to the moon very quickly, and depending on the complexity of your data and neural model, processing time becomes unreasonably slow on most CPUs. Depending on the specifics of your production, we can either restrict ourselves to a very short period of time (e.g. the last 60 minutes) or randomly choose just a percentage of the whole production, as in the sketch below.
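A small sketch of how quickly exhaustive simulation explodes, and one way to sample configurations instead; the component pools and the 50,000-sample size are illustrative.

```python
# Sketch: exhaustive configuration space vs. random sampling of candidate builds.
import itertools
import random

pools = {
    "injector":    list(range(100)),
    "turbopump":   list(range(100)),
    "nozzle":      list(range(100)),
    "chamber":     list(range(100)),
    "valve_block": list(range(100)),
}

# 100^5 = 10 billion possible engines - far too many to run through the model
all_configs = itertools.product(*pools.values())  # lazy generator, do not materialise it
total = 1
for units in pools.values():
    total *= len(units)
print(f"exhaustive configurations: {total:,}")

# Instead, randomly sample a manageable subset of configurations to predict
rng = random.Random(0)
candidate_configs = [
    {part: rng.choice(units) for part, units in pools.items()}
    for _ in range(50_000)
]
```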

Eventually we will end up with a lot of predicted functional test results based on production inputs.

8. Analyse predictions / optimise

In this step there are two optimisation choices: analyse the data manually using charts, correlations, etc. and work out corrections for manufacturing dimensions and parameters, or use an automated optimisation algorithm, e.g. scipy.optimize.minimize. The first option is standard data science and statistics analysis of the live predictions, which involves human input. Nothing stops you from doing both in parallel, and again the environment and product specification will affect that choice, but ultimately we want the system to live its own life while still delivering the best possible results.

In the previous step we generated predictions for current production, but we don't necessarily need them if we just use an optimisation algorithm. It is still good to keep that historical data, though, to be able to monitor accuracy against actual results.

We will try multiple types of optimisation algorithms that generate and save recommendations for all the inputs we choose. At this step we will need a list of all drawing and production limits for your features; a minimal sketch follows.
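As an example of the automated option, here is a hedged sketch with scipy.optimize.minimize that searches for input settings whose predicted thrust is closest to the 1.7-million-pound target within drawing limits. `drawing_lower_limits`, `drawing_upper_limits` and `nominal_values` are placeholders you would fill from your own limits list; `scaler`, `poly`, `pca` and `model` come from the earlier sketches.

```python
# Sketch: bounded optimisation of inputs against the trained prediction chain.
import numpy as np
from scipy.optimize import minimize

TARGET_THRUST = 1_700_000.0  # pounds of thrust, from the example scenario

def predicted_thrust(x):
    # Run candidate inputs through the same preprocessing chain used in training
    x_scaled = scaler.transform(x.reshape(1, -1))
    x_poly = poly.transform(x_scaled)
    x_reduced = pca.transform(x_poly)
    return float(model.predict(x_reduced, verbose=0)[0, 0])

def objective(x):
    return (predicted_thrust(x) - TARGET_THRUST) ** 2

bounds = list(zip(drawing_lower_limits, drawing_upper_limits))  # one (low, high) pair per input feature
x0 = np.asarray(nominal_values, dtype=float)                    # start from nominal drawing dimensions

result = minimize(objective, x0, method="L-BFGS-B", bounds=bounds)
recommended_settings = result.x
```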

9. Use recommendations

Once the recommended settings are saved in the database, we need to configure the machines to pull them automatically from that source; this is a standard programming task, sketched below. We can choose which machines, parameters and update frequency to use for that automated process. In case of technical limitations we might need operating staff to read the recommendations and enter them into the machine manually. You can see now why developing fully IoT-enabled manufacturing machines is so important when it comes to advanced manufacturing.
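A minimal sketch of writing the recommendations so machines (or operators) can pull them back; the table name, columns, model version tag and connection string are hypothetical, and `input_cols` / `recommended_settings` come from the earlier sketches.

```python
# Sketch: publish recommended settings to a table and read them back on the machine side.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@db-host/production")

recommendations = pd.DataFrame({
    "feature_name": input_cols,
    "recommended_value": recommended_settings,
    "model_version": "thrust_model_v1",
})
recommendations.to_sql("machine_recommendations", engine, if_exists="append", index=False)

# A machine controller (or an operator screen) reads the settings it needs
latest = pd.read_sql(
    "SELECT feature_name, recommended_value "
    "FROM machine_recommendations WHERE model_version = 'thrust_model_v1'",
    engine,
)
```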

10. Verification

Once the changes are made we want to verify the results. Again this is a standard programming/analytics task, and depending on our goal we will be looking for different things. The initial phase is normally very dynamic: a lot of things will need adjustment, so it is advisable to start with a lower number of controlled features and expand the auto-control as we gain confidence.

11. Maintenance & Continuous Improvements

Systems like this can be fragile unless they are closely monitored and all physical factors are in good shape. We need to decide if and how often to retrain our neural network model and the other utilities. The data source will also be subject to constant development as our parameter spectrum evolves. It takes time to gain confidence, but the sooner an organization starts using this sort of solution, the faster it gets to the stage where it feels normal and people are comfortable with the tools.

As you can see from this brief summary, this kind of system is not that complicated and could be successfully implemented across multiple areas of manufacturing. Additional benefits include: a higher focus on sensor measurement quality across all operations, thanks to the value they add to the process; a better understanding of interactions within the manufacturing chain and product function; and the potential to drive further design improvements for future product updates and new developments. The beauty of this system is that it has unlimited potential for scaling both vertically and horizontally. As you have probably realised, the applications can be wider than just manufacturing.

If you liked it, please leave a follow and a clap so you can see the next detailed chapters of this guide.

AI/ data science engineer