From woodblock printing, which dates to before 220 AD, to today's digital printing, printing has come a long way. Newspapers, magazines and books are just some of the end products of what is considered the fifth-largest global manufacturing industry. Nowadays, businesses across many industries benefit from the implementation of machine learning. This mainstream adoption has been a wake-up call for the print industry as well, which became aware of the potential value of machine learning only in recent years. Fifteen trainees of the PDEng Software Technology program of TU/e were asked by Océ-Canon to use machine learning to identify and cluster different paper types (‘media’), in order to find the right print settings for each of them and ultimately improve the quality and efficiency of the printing process.
When Lodewijk van der Grinten opened his first pharmacy in his home town of Venlo, back in 1857, little did he know about the impact of his entrepreneurial spirit. The success story of Océ-Canon - which is now considered one of the biggest economic powers in Venlo and its entire region - started with some chemistry recipes, a piece of butter and its coloring via some shredded carrots, the “reddest carrots possible”. Acquired in 2010 by Canon Inc. of Japan, Océ-Canon nowadays has clients ranging from “creative studios around the corner to blue-chip multinationals around the globe”. Customers in these environments are focused on uptime, high quality and on-demand applications.
Lodewijk records his recipe for butter coloring.
“Take the reddest carrots possible and shred them with a knife. Take warm milk and place the shredded carrots in it. Stir and sieve it through a linen cloth to strain it, add to the butter, and beat it”
(Source: Océ-Canon)
The PDEng project
For this training project, the PDEng trainees were asked to work with the Océ VarioPrint i300, which produces approximately 18,000 sheets per hour. Printed sheets can be made of several media types. Choosing from the various types of media available on the market is a key factor in achieving top-quality digital printing. You need to choose the paper type that is best suited to bringing out the visual and textual contents of your publishing or graphic design project, a choice that should be made carefully and that will vary depending on the type of product you wish to print.
The problem
“A problem we are currently facing in the field is that we are unable to identify the actual media that are being used because operators are free to change the names of media. Also, sometimes, identical media is sold under different names, or a certain type of media changes over time due to, for example, differences in the locations of the trees and the type of soil used to grow those”, says Patrick Vestjens.
Choosing from the range of paper types on offer is not always easy, especially when dealing with a huge array of sheets (‘stacks’), all with completely different properties. Pranav Bhatnagar, ST trainee and project manager of this assignment: “Each type of paper has a weight, a processing type and a finish, which determine its transparency, appearance, weight, thickness, level of opaqueness, feel and durability”. So, how can the correct media stack be identified and the optimal print settings chosen? Because different media require different print settings, each medium must be identified before the right settings can be applied. This is where machine learning comes into play.
Machine learning
Peter Kruizinga: “When the printer operates, the integrated sensors make possible the collection of a large amount of data. Thus, we wanted to explore the possibility to identify media based on this data. The idea is that each media has its own characteristics that are reflected in the data as a sort of media fingerprint”. To fulfill this need, the PDEng trainees were asked by the Océ-Canon team to use machine learning. Bhatnagar: “Machine learning can be used to identify the right media, so to use the correct printer settings. It can also be used to print unknown media with the most appropriate printer settings. Lastly, via machine learning, data that do not have to be stored anymore could be identified and removed, ultimately saving space in clusters.”
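To make the 'media fingerprint' idea concrete, here is a minimal Python sketch of one way it could work. It is not the model the trainees built, which the article does not describe. The media names, the three sensor features (thickness, opacity, roughness, all assumed to be scaled to the range 0 to 1) and the distance threshold are hypothetical. Each known medium is summarized by the average of its sensor readings, and a new reading is matched to the nearest average, or reported as unknown if nothing is close enough.

```python
import math

# Hypothetical training readings per medium: (thickness, opacity, roughness),
# each assumed to be scaled to 0..1. Real sensor data will look different.
TRAINING = {
    "coated_gloss": [(0.20, 0.95, 0.30), (0.22, 0.93, 0.28), (0.21, 0.94, 0.32)],
    "uncoated_offset": [(0.60, 0.85, 0.50), (0.62, 0.83, 0.52), (0.58, 0.84, 0.48)],
}


def centroid(rows):
    """Average each feature over all readings of one medium."""
    n = len(rows)
    return tuple(sum(column) / n for column in zip(*rows))


def build_fingerprints(samples):
    """Map each medium name to its average reading (its 'fingerprint')."""
    return {name: centroid(rows) for name, rows in samples.items()}


def identify(reading, fingerprints, max_distance=0.15):
    """Return the closest medium, or None if no fingerprint is near enough."""
    name, distance = min(
        ((n, math.dist(reading, c)) for n, c in fingerprints.items()),
        key=lambda pair: pair[1],
    )
    return name if distance <= max_distance else None


fingerprints = build_fingerprints(TRAINING)
known = identify((0.23, 0.92, 0.31), fingerprints)
unknown = identify((0.95, 0.10, 0.90), fingerprints)
print(known, unknown)
```

A reading that matches no fingerprint corresponds to the 'unknown media' case Bhatnagar mentions. In this sketch it is only flagged, and choosing print settings for it would be a separate step.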
Workflow and work teams
For this project, data mining was performed by Océ-Canon, while the PDEng trainees focused on exploratory data analysis (EDA), data preprocessing, training and testing of the model and, ultimately, its improvement. To ensure a smooth workflow and to distribute the work according to expertise within the team, the 15 PDEng trainees split into three groups: the feature engineering team, the clustering team and the classification team.
Data Science process
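As an illustration of what the clustering step can offer, the Python sketch below groups media whose fingerprints are nearly identical, which is one way to spot the same paper sold or registered under different names. It is an assumption-laden example rather than the method used in the project: the media names, feature values and distance threshold are hypothetical, and the grouping rule (single linkage with a fixed threshold) was chosen for brevity.

```python
import math


def cluster_by_distance(fingerprints, threshold):
    """Group media names whose fingerprints lie within `threshold` of each other.

    Names that are linked through a chain of close neighbours end up in the
    same group (single-linkage clustering via a small union-find structure).
    """
    names = sorted(fingerprints)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, shortening the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(fingerprints[a], fingerprints[b]) <= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical fingerprints: (thickness, opacity, roughness), scaled to 0..1.
example = {
    "Gloss 120": (0.21, 0.94, 0.30),
    "gloss-120-new": (0.22, 0.93, 0.31),
    "Offset 80": (0.60, 0.84, 0.50),
}
clusters = cluster_by_distance(example, threshold=0.05)
print(clusters)
```

Here the two gloss entries fall into one group because their fingerprints almost coincide, while the offset paper stays on its own. With real data the threshold would have to be tuned, and the grouping checked against the classification results.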
Project challenges and support from Océ-Canon
Bhatnagar: “The project was challenging but in a positive way. We had a very short time to complete the project, and we lacked knowledge on the environment in which the printers’ systems operate. Also, our experience with machine learning was limited”. Coping with these gaps while trying to produce results was, in the words of Bhatnagar, “the most challenging part”. Yet, the trainees could still count on the regular support from the Océ-Canon team. Konstantinos Karmas, ST trainee and lead architect of the project: “Patrick Vestjens (Domain Architect) and Peter Kruizinga (Lead Technologist) were always available and responsive when we needed them. They were willing to answer the long list of questions we sent on a weekly basis. Getting timely inputs was crucial to clear out our doubts, both domain-wise and technically, and for the progress of the project. Their reactivity was exemplary, and they tried to be as much involved as they could. This was all very motivating for the team.”
Results and future work
Within two months, the PDEng trainees compiled a report on actual Océ case results, findings and conclusions. In addition, the team produced a report on machine learning tools and a data preprocessing script, both of which Océ-Canon can use in the future. Lastly, as a result of the PDEng trainees’ work, the Océ-Canon team can now count on a model that identifies media based on data from the sensors built into the Océ VarioPrint i300. “In the future”, says Karmas, “more attention should be given to the clustering techniques, and to the cross-validation of those results with classification.” Given the relatively short time frame of the project and the complexity of the problem, the Océ-Canon team is very satisfied with the final outcome. “The results that the PDEng students showed are impressive”, they say. “The model looks very promising, although it still needs to be validated more thoroughly in terms of applicability across printers and accuracy over time.”
PDEng trainees and Yanja Dajsuren (Program Director of the PDEng ST program) with Wim Verhofstad, Patrick Vestjens and Peter Kruizinga at Océ-Canon in Venlo.
Success comes from happy (and rewarded) people
“The most important thing I learned from this experience is to embrace uncertainty”, says Bhatnagar. “At first, the team was not completely aware of the directions the project would have followed, yet taking one step at a time helped a lot. Also, as a project leader, I realized that there is no success without a happy team. Happiness was the driving factor for us, and also what kept the motivation and the communication up. Last but not least, the genuine interest of our client and expertise support in our project made our team feel important and our efforts rewarded.”
PDEng ST track
The Software Technology (ST) program is a salaried two-year technological designer program at doctorate level for MSc graduates with a degree in Computer Science or a related field. Like a PhD candidate, you will have the status of TU/e employee after being selected for this program. The Software Technology program is designed to prepare trainees for an industrial career as a technological designer, and later on as a software or system architect. It starts with 14 months of advanced training and education, including three small, industry-driven training projects, followed by a major design project of 10 months in a company.
Artificial Intelligence
Recently, the PDEng Software Technology program has expanded to cover the field of Artificial Intelligence (AI) via courses, workshops, hackathons and trainings in companies. The project with Océ-Canon is in line with this shift, as it addressed, for the first time, both AI and software design challenges. Yanja Dajsuren, Program Director of the PDEng ST program: “Last year 25% of our graduation projects were about data-driven architecture and intelligent systems, in collaboration with, to name a few, Hendrix Genetics, EIT Digital and Thermo Fisher Scientific. We are proud to give our contribution to the rapidly expanding field of Artificial Intelligence, which is estimated to reach $2.9 trillion in business value by 2021. We are confident that our program will contribute to the development of innovative AI platforms, tools, methods, and strengthen the collaboration of the most innovative minds in industry and academia.”