Data centers form the backbone of almost all technologies and services we use today in banks, hospitals, governments, and everything else. The way they store, process, integrate, and utilize data, which is the new oil of the digital economy, plays an increasingly important role in the success – or failure – of businesses.
Today’s ultra-competitive business environment demands that data be accessible instantly and round the clock, yet secure from theft and rogue usage. Trends such as video streaming, social media, big data, artificial intelligence, bitcoin, and the digitalization of business processes and production flows are leading to more and more data being stored and processed in data centers.
As the complexity and variety of data and technology increase, data centers continue to grow and proliferate to meet current and future demands and to adapt to today’s dynamic business and technology environment. As a result, they are becoming larger and more complex in terms of the number and behavior of their internal components.
With terabytes of data flowing through data centers every second, Thomas Clausen, professor and researcher at the Computer Science Laboratory (LIX) at École Polytechnique, is working with his team on reorganizing and restructuring data centers for more efficient reception, processing, and distribution of data.
A graduate of Aalborg University, Denmark, and a Senior Member of the IEEE, Thomas Clausen has, since 2004, been on the faculty at École Polytechnique, France’s premier technical and scientific university, where he holds the Cisco-endowed “Internet of Everything” academic Chair and leads the computer networking research group. Thomas has published more than 80 peer-reviewed academic publications (which have attracted more than 12,000 citations). We had a chance to interview Thomas Clausen about data centers and their increasingly critical role in artificial intelligence. You can read the interview below:
1. All AI systems leverage the power of big data to predict outcomes and successful courses of action. They analyze an exponential amount of data from hundreds of different sources. This means that for AI to work efficiently or economically, it needs AI-enabled, next-generation data centers to strategically optimize the performance and workloads. Tell us about how data centers can revolutionize AI technologies.
Yes, AI — and more broadly — Data Science relies on the ability to efficiently process data in a data center. Now, a data center is little more than a room with rows, and rows, of racks — each rack with rows and rows of computers. For a given workload — for example, a (complete or partial) dataset on which to run an AI algorithm — a very mechanical challenge is to “place that workload on one of these computers.” The question becomes “on which computer should an incoming workload be placed?” — with the goal being for the workload to be executed “as quickly as possible,” of course. And this is one area where there is room for innovation within the data center.
See, the entity that places the workload on a given computer is called a Load Balancer (LB), and it can apply one of several strategies.
For example, workloads can be placed at a “random” computer in the data center. However, it can be observed that this does lead to some computers being “overloaded” with others being idle.
An LB can also place the workloads on the computers in the data center in a round-robin fashion. In this case, assuming that all workloads have the same “size” (the same execution time), that would lead to an even load on all computers, at least on average. However, that assumption is almost never satisfied, and it is provably impossible to determine the execution time for an arbitrary incoming workload before actually executing it (this is an instance of the “Halting Problem”).
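The two “blind” strategies described so far can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the server names, the `simulate` helper, and the job durations are hypothetical and are not taken from the interview.

```python
import itertools
import random

def place_random(servers, rng):
    """Pick any server uniformly at random, ignoring its current load."""
    return rng.choice(servers)

def make_round_robin(servers):
    """Return a picker that cycles through the servers in a fixed order."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)

def simulate(servers, durations, pick):
    """Sum the execution time that ends up assigned to each server."""
    load = {s: 0 for s in servers}
    for duration in durations:
        load[pick()] += duration
    return load

servers = ["s0", "s1", "s2", "s3"]   # hypothetical machines
equal = [1] * 8                      # every workload takes the same time
uneven = [10, 1, 1, 1, 10, 1, 1, 1]  # a few workloads are much longer

rr_equal = simulate(servers, equal, make_round_robin(servers))
rr_uneven = simulate(servers, uneven, make_round_robin(servers))

rng = random.Random(0)
rand_equal = simulate(servers, equal, lambda: place_random(servers, rng))
```

With equal durations, round-robin gives every server the same total. With the uneven durations above, both long jobs land on `s0`, which illustrates why the equal-size assumption matters.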
These two strategies are “blind” in that the LB does not observe the actual state of the individual computers in the data center. In other words, the LB is stateless. A different family of load balancing strategies has the LB be stateful: it continuously “monitors” the state of each computer and then “places” a workload on the least loaded computer in the data center. However, there are three problems with that: (i) the overhead of monitoring this state (the number of communications between the LB and the computers) is considerable, (ii) the information available to the LB will always be inaccurate, as a function of the monitoring frequency, and (iii) the load balancer needs to maintain and process significant volumes of information, and be able to make instantaneous decisions on which computer to place an incoming workload.
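A toy model can make the trade-off between monitoring overhead and stale information concrete. The sketch below assumes a hypothetical `StatefulBalancer` that polls every server once per refresh and that workloads never finish; both are simplifications chosen for illustration.

```python
class StatefulBalancer:
    """Hypothetical LB that places work on the server it believes is least loaded."""

    def __init__(self, servers):
        self.actual = {s: 0 for s in servers}  # true load on each server
        self.view = dict(self.actual)          # the LB's last snapshot of that load
        self.polls = 0                         # monitoring messages sent so far

    def refresh(self):
        """Poll every server: one message per server per refresh."""
        self.view = dict(self.actual)
        self.polls += len(self.actual)

    def place(self, cost):
        """Choose the least loaded server according to the (possibly stale) view."""
        target = min(self.view, key=self.view.get)
        self.actual[target] += cost
        return target

def run(refresh_every, jobs=8):
    """Place `jobs` unit-cost workloads, refreshing the view every N placements."""
    lb = StatefulBalancer(["s0", "s1", "s2", "s3"])
    for i in range(jobs):
        if i % refresh_every == 0:
            lb.refresh()
        lb.place(1)
    return lb

fresh = run(refresh_every=1)  # accurate view, many monitoring messages
stale = run(refresh_every=8)  # few messages, but the view is out of date
```

Refreshing before every placement spreads the eight jobs evenly at the cost of 32 polls. Refreshing only once costs 4 polls, but every job is sent to `s0` because the LB still believes it is idle.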
We have shown that applying a decentralized decision strategy can afford an up to 20% overall response time improvement for a data center. In this strategy, the LB designates not a single computer but an ordered list of computers, and the first computer in that list that has the capacity to accept the workload does so (otherwise, it forwards the workload to the next computer in the list). This keeps the LB entirely stateless: essentially, it proposes the “ordered list of computers” by applying a simple hash function to the incoming workload.
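The idea of a hash-derived candidate list can be sketched as follows. This is not the published implementation: the ranking by a per-workload SHA-256 hash (a rendezvous-style hashing scheme), the `Server` class, the list length `k=2`, and the rule that the last candidate always accepts are all assumptions made for this illustration.

```python
import hashlib

def candidate_list(workload_id, servers, k=2):
    """Derive an ordered list of k distinct servers from a hash of the workload id."""
    ranked = sorted(
        servers,
        key=lambda s: hashlib.sha256(f"{workload_id}:{s}".encode()).hexdigest(),
    )
    return ranked[:k]

class Server:
    """Hypothetical server that knows only its own load."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.active = 0

    def has_capacity(self):
        return self.active < self.capacity

def dispatch(workload_id, servers_by_name, k=2):
    """Simulate forwarding along the list; each server decides locally.

    Assumption: the last candidate always accepts, so no workload is dropped.
    """
    names = candidate_list(workload_id, list(servers_by_name), k)
    for position, name in enumerate(names):
        server = servers_by_name[name]
        if server.has_capacity() or position == len(names) - 1:
            server.active += 1
            return name

servers = {n: Server(n, capacity=1) for n in ["s0", "s1", "s2", "s3"]}
first = dispatch("job-42", servers)   # accepted by the first candidate
second = dispatch("job-42", servers)  # first candidate is full, so it is forwarded
```

The LB part (`candidate_list`) keeps no state about the servers. The accept-or-forward decision uses only information each server already has about itself.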
This is, of course, an encouraging result, but we believe that we can do even better by allowing the LB to observe, for example, the completion time of workloads when placed on individual computers, and to use Machine Learning and AI on such indirect signals to infer the load on these computers. This information would be used to inform the selection of the “ordered list of computers” that the LB provides for the distributed decision process. This is ongoing (and not yet published) work, so precise results are forthcoming.
2. You and your team have been working on acquiring and accumulating qualitative and quantitative data to build a platform for analyzing smart buildings that can reduce harmful effects on the environment. Can you tell us more about this project?
Actually, the key person within my team for this project is Dr. Jiazi Yi (Computer Science Laboratory (LIX) at École Polytechnique) — and the scope is larger than just my team. This is an interdisciplinary project, in partnership with our colleagues from the Laboratory for Dynamic Meteorology (LMD) at École Polytechnique, where the key person is Dr. Jordi Badosa.
On the project itself, the basis for making decisions is to have data upon which to make these decisions. Using the Drahi-X Novation Center building on the École Polytechnique campus as a test case, we are instrumenting it to acquire fine-grained and qualitative data about energy consumption, occupancy, etc. This implies developing sensors and detectors that can be — unobtrusively — installed, developing the communications infrastructure that allows the data from these to be conveyed to a data platform, and of course, development and maintenance of that data platform. This allows us to know — for example — something about energy consumption profiles. More results to come on this at a later date.
On the roof of the Drahi-X Novation Center, there are solar panels installed, with the goal of being (as much as possible) self-sufficient in energy and thus not calling upon grid supply. That, therefore, requires matching the energy consumption to what is being produced by the solar panels as much as possible. Combining the fine-grained consumption and occupancy data with accurate meteorological predictions (and thereby predictions of how much energy will be available from the solar panels) allows, for example, a better match between consumption and production.
Another observation is that within the Drahi-X Novation Center, the IT infrastructure, which includes a small data center, is a significant consumer of energy. Classically, this energy consumption is considered “incompressible” since it is not possible to simply “turn off a rack in a data center”: doing so would be assumed to cause data loss. However, there again, we have shown that it is possible to, at least partly, modulate the number of computers that are active in a data center, thus giving the ability to “turn off” those that are, at a given time, idle yet still consuming energy. A further step will be to use the meteorological predictions of when there will be “excess energy produced by the solar panels” to schedule workloads for execution on the data center and, conversely, to decide not to schedule non-critical workloads when there is no excess energy available.
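As a rough illustration of that last step, the sketch below defers non-critical workloads to hours with forecast solar surplus. It is a simple greedy heuristic under invented assumptions, not the project's scheduler: the `schedule` function, the kWh figures, the job names, and the rule that each workload fits within one hour are all hypothetical.

```python
def schedule(hours, workloads):
    """Greedy sketch: run critical jobs immediately, defer others to surplus hours.

    hours: list of (solar_kwh, base_kwh) forecasts, one pair per hour.
    workloads: list of (name, kwh, critical); each is assumed to fit in one hour.
    Returns {name: hour_index}, with None for jobs left unscheduled.
    """
    # Surplus is whatever the panels produce beyond the building's base demand.
    surplus = [max(0.0, solar - base) for solar, base in hours]
    plan = {}
    for name, kwh, critical in workloads:
        if critical:
            # Critical work runs now, drawing on the grid if there is no surplus.
            plan[name] = 0
            surplus[0] = max(0.0, surplus[0] - kwh)
            continue
        # Non-critical work waits for the first hour with enough surplus.
        slot = next((h for h, s in enumerate(surplus) if s >= kwh), None)
        if slot is not None:
            surplus[slot] -= kwh
        plan[name] = slot
    return plan

forecast = [(1.0, 2.0), (5.0, 2.0), (6.0, 2.0)]  # hypothetical hourly forecast
jobs = [
    ("backup", 2.0, False),
    ("billing", 1.0, True),
    ("reindex", 3.0, False),
    ("ml-train", 4.0, False),
]
plan = schedule(forecast, jobs)
```

In this example the critical job runs in hour 0 despite the lack of surplus, two non-critical jobs are moved to sunnier hours, and the largest one stays unscheduled because no hour has enough remaining surplus.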
Dr. Michalis Vazirgiannis is another professor at l’X. He is currently researching big data mining and machine learning, aiming to harness the full potential of machine learning algorithms for large-scale datasets. His research covers deep and machine learning for graph analysis (including community detection, graph classification, clustering and embeddings, and influence maximization) and text mining (including Graph of Words and deep learning for NLP tasks and applications). He believes that, with its increased and uncontrolled use, AI has a dark side, especially given its potential to automate human jobs or to develop self-consciousness.
3. When we talk about data mining, AI, machine learning technologies being used by tech giants, there is always a dark side, with the increased and uncontrolled use. Can you share some of the pressing concerns today?
Indeed, AI is about applying algorithms to large-scale data with the objective of predicting some interesting labels or learning the patterns and structure of the data. Therefore, access to large-scale quality data is essential to AI.
Data are like fuel to the AI engine. Currently, the vast amounts of data produced by users’ behavior and interactions (e.g., browsing, content consumption, purchase behaviors) are mainly owned by the private entities that capitalize on the knowledge extracted from these data.
The problem is that these data are used and combined without the real consent of the users who, in the end, are the ones who produce them and therefore should own them.
Therefore, it becomes essential that users’ data, properly anonymized, are also available to society and governments for policy and decision making. A key political question of our era is how to regulate the ownership of data. Of course, the issue of privacy then becomes important as well.
4. What are the ways to solve these challenges and work for the betterment of humanity through technology and science?
AI brings an unprecedented evolution to what machines and algorithms can do and strongly affects society, the economy, and all aspects of life. Ways to tackle the above issues include:
– Society involvement in order to capitalize on the gains of AI for the common good with the added value of AI being distributed to society as widely as possible. This can be achieved through continuous training of the population on the new concepts and capacities of AI as well as involving societies via democratic processes – keeping in mind the inevitable transformation of societies and economies through AI.
– Data and computation sovereignty: Data, algorithms, and processing power are the three elements of AI that are mainly controlled by powerful industrial entities. It is essential that these elements are also provided via political decisions as public resources to the population, which helps to avoid discrimination. People should be able to have access to their own behavioral data and profiles as a start.
– Explicability of AI-backed decisions: It is essential that decisions made by AI algorithms are explained to those affected so that they can understand and reason about them. For example, a relevant initiative is “Data for Good,” which, among other things, promotes transparency on the logic and the source code of algorithms for affected users.
- Y. Desmouceaux, P. Pfister, J. Tollet, M. Townsley, and T. Clausen, "6LB: Scalable and Application-Aware Load Balancing with Segment Routing," IEEE/ACM Transactions on Networking, vol. 26, no. 2, pp. 819-834, April 2018, doi: 10.1109/TNET.2018.2799242.
- Y. Desmouceaux, M. Enguehard, and T. Clausen, "Joint Monitorless Load-Balancing and Autoscaling for Zero-Wait-Time in Data Centers," IEEE Transactions on Network and Service Management (paper accepted, to be published soon).