Computer chips have stopped getting faster: The regular performance improvements we’ve come to expect are now the result of chipmakers’ adding more cores, or processing units, to their chips, rather than increasing their clock speed.

In theory, doubling the number of cores doubles a chip’s processing capacity, but splitting up computations so that they run efficiently in parallel isn’t easy. On the other hand, say a trio of computer scientists from MIT, Israel’s Technion, and Microsoft Research, neither is it as hard as had been feared.

Commercial software developers writing programs for multicore chips frequently use so-called “lock-free” parallel algorithms, which are relatively easy to generate from standard sequential code. In fact, in many cases the conversion can be done automatically.

Yet lock-free algorithms don’t come with very satisfying theoretical guarantees: All they promise is that at least one core will make progress on its computational task in a fixed span of time. But if they don’t exceed that standard, they squander all the additional computational power that multiple cores provide.

In recent years, theoretical computer scientists have demonstrated ingenious alternatives called “wait-free” algorithms, which guarantee that all cores will make progress in a fixed span of time. But deriving them from sequential code is extremely complicated, and commercial developers have largely neglected them.

In a paper to be presented at the Association for Computing Machinery’s Annual Symposium on the Theory of Computing in May, Nir Shavit, a professor in MIT’s Department of Electrical Engineering and Computer Science; his former student Dan Alistarh, who’s now at Microsoft Research; and Keren Censor-Hillel of the Technion demonstrate a new analytic technique suggesting that, in a wide range of real-world cases, lock-free algorithms actually give wait-free performance.

“In practice, programmers program as if everything is wait-free,” Shavit says. “This is a kind of mystery. What we are exposing in the paper is this little-talked-about intuition that programmers have about how [chip] schedulers work, that they are actually benevolent.”

The researchers’ key insight was that the chip’s performance as a whole could be characterized more simply than the performance of the individual cores. That’s because the allocation of different “threads,” or chunks of code executed in parallel, is symmetric. “It doesn’t matter whether thread 1 is in state A and thread 2 is in state B or if you just swap the states around,” says Alistarh, who contributed to the work while at MIT. “What we noticed is that by coalescing symmetric states, you can simplify this a lot.”

In a real chip, the allocation of threads to cores is “a complex interplay of latencies and scheduling policies,” Alistarh says. In practice, however, the decisions arrived at through that complex interplay end up looking a lot like randomness. So the researchers modeled the scheduling of threads as a process that has at least a little randomness in it: At any time, there’s some probability that a new thread will be initiated on any given core.

The researchers found that even with a random scheduler, a wide range of lock-free algorithms offered performance guarantees that were as good as those offered by wait-free algorithms.

That analysis held, no matter how the probability of thread assignment varied from core to core. But the researchers also performed a more specific analysis, asking what would happen when multiple cores were trying to write data to the same location in memory and one of them kept getting there ahead of the others. That’s the situation that results in a lock-free algorithm’s worst performance, when only one core is making progress.

For that case, they considered a particular set of probabilities, in which every core had the same chance of being assigned a thread at any given time. “This is kind of a worst-case random scheduler,” Alistarh says. Even then, however, the number of cores that made progress never dipped below the square root of the number of cores assigned threads, which is still better than the minimum performance guarantee of lock-free algorithms.
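
The flavor of that model is easy to capture in a toy simulation. The sketch below is an illustration only, not the paper’s analysis: a uniform random scheduler repeatedly picks one of many threads contending for a single shared counter, and each thread alternates between reading the counter and attempting a compare-and-swap that fails if the value has changed since its read. Even in this maximum-contention setting, progress is spread across many threads.

```python
import random
from collections import Counter

# Toy model of a uniform random scheduler driving a lock-free counter under
# maximum contention. This is only an illustration of the scheduling model
# described above, not the paper's analysis; all parameters are made up.

def simulate(num_threads=64, steps=20000, seed=0):
    rng = random.Random(seed)
    counter = 0                        # the shared memory location
    snapshot = [None] * num_threads    # each thread's last read of the counter
    successes = Counter()              # completed operations per thread

    for _ in range(steps):
        t = rng.randrange(num_threads)         # scheduler picks a thread uniformly
        if snapshot[t] is None:
            snapshot[t] = counter              # "read" step
        else:
            if snapshot[t] == counter:         # simulated compare-and-swap
                counter += 1
                successes[t] += 1
            snapshot[t] = None                 # succeed or fail, re-read next time

    return counter, successes

total, per_thread = simulate()
print(f"completed operations: {total}")
print(f"threads that made progress: {len(per_thread)} of 64")
```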

By Larry Hardesty, MIT News Office

MIT professors Shafi Goldwasser and Silvio Micali have won the Association for Computing Machinery’s (ACM) A.M. Turing Award for their pioneering work in the fields of cryptography and complexity theory. The two developed new mechanisms for how information is encrypted and secured, work that is widely applicable today in communications protocols, Internet transactions and cloud computing. They also made fundamental advances in the theory of computational complexity, an area that focuses on classifying computational problems according to their inherent difficulty. 

Goldwasser and Micali were credited for “revolutionizing the science of cryptography” and developing the gold standard for enabling secure Internet transactions. The Turing Award, which is presented annually by the ACM, is often described as the “Nobel Prize in computing” and comes with a $250,000 prize.

“For three decades Shafi and Silvio have been leading the field of cryptography by asking fundamental questions about how we share and receive information. I am thrilled that they have been honored for their pioneering work, and particularly excited that they have been recognized for their achievements as a team,” says Professor Daniela Rus, director of MIT’s Computer Science and Artificial Intelligence Lab (CSAIL). “We are honored and privileged to have this tremendous duo here at CSAIL.”

Goldwasser is the RSA Professor of Electrical Engineering and Computer Science at MIT and a professor of computer science and applied mathematics at the Weizmann Institute of Science in Israel. She leads the Theory of Computation Group at CSAIL. Micali is the Ford Professor of Engineering at MIT and leads the Information and Computer Security Group at CSAIL, along with Goldwasser and Professor Ronald L. Rivest.

“I am delighted that Professors Shafi Goldwasser and Silvio Micali have been recognized and honored with the prestigious ACM Turing Award for their fundamental contributions to the field of provable security. Their work has had a major impact on a broad spectrum of applications touching everyday lives and has opened exciting new research opportunities,” says Anantha Chandrakasan, head of MIT’s Department of Electrical Engineering and Computer Science (EECS). “This is a tremendous honor for the EECS department and inspiring for the large number of students and faculty who have benefited from interactions with Shafi and Silvio.”

Goldwasser and Micali began collaborating as graduate students at the University of California at Berkeley in 1980 while working with Professor Manuel Blum, who received his bachelor’s, master’s and PhD degrees at MIT — and received the Turing Award in 1995. Blum would be the thesis advisor for both of them. While toying around with the idea of how to securely play a game of poker over the phone, they devised a scheme for encrypting and ensuring the security of single bits of data. From there, Goldwasser and Micali proved that their scheme could be scaled up to tackle much more complex problems, such as communications protocols and Internet transactions.

Based on their work, Goldwasser and Micali published a paper in 1982, titled “Probabilistic Encryption,” which laid the groundwork for modern cryptography. In the paper they introduced formal security definitions, which remain the gold standard for security to this day, and pioneered randomized methods for encryption. Goldwasser and Micali proved that encryption schemes must be randomized rather than deterministic, with many possible encrypted texts corresponding to each message, a development that revolutionized the study of cryptography and laid the foundation for the theory of cryptographic security.
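
The scheme at the heart of that paper, now known as Goldwasser-Micali encryption, encrypts a single bit by hiding it behind the difficulty of telling squares from non-squares modulo a composite number, so that every encryption of the same bit looks different. The sketch below is a toy rendering with deliberately tiny primes, meant only to convey the idea; real parameters are hundreds of digits long.

```python
import math
import random

# Toy Goldwasser-Micali encryption: a bit b is encoded as y^b * r^2 mod N for a
# fresh random r, so ciphertexts of the same bit are all different. The primes
# here are tiny and purely illustrative, nowhere near secure sizes.

p, q = 499, 547          # secret primes (toy sizes)
N = p * q                # public modulus

def is_square_mod(a, prime):
    """Euler's criterion: a is a quadratic residue mod an odd prime iff a^((prime-1)/2) == 1."""
    return pow(a, (prime - 1) // 2, prime) == 1

# The public key also includes y, a non-square modulo both p and q.
y = next(v for v in range(2, N) if not is_square_mod(v, p) and not is_square_mod(v, q))

def encrypt(bit):
    r = random.randrange(2, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(2, N)
    return (pow(y, bit, N) * pow(r, 2, N)) % N

def decrypt(c):
    # Knowing the factors of N, test whether c is a square modulo both p and q.
    return 0 if is_square_mod(c % p, p) and is_square_mod(c % q, q) else 1

for bit in (0, 1, 1, 0):
    c = encrypt(bit)
    assert decrypt(c) == bit
    print(f"bit {bit} -> ciphertext {c}")
```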

They also introduced the simulation paradigm, which demonstrates a system’s security by showing that an adversary could have simulated, on its own, all the information it obtained while the cryptographic system was in use, and therefore that the system gives nothing useful away. The simulation paradigm has become the most widely used method for establishing security in cryptography, going beyond privacy to address problems in authentication and integrity of data, software protection and protocols that involve many participants, such as electronic elections and auctions.

One of Goldwasser and Micali’s most significant contributions is their 1985 paper, with Charles Rackoff, titled “The Knowledge Complexity of Interactive Proof Systems.” It introduced knowledge complexity, a concept that deals with hiding information from an adversary, and is a quantifiable measure of how much “useful information” could be extracted. The paper initiated the idea of “zero-knowledge” proofs, in which interaction (the ability of provers and verifiers to send each other messages back and forth) and probabilism (the ability to toss coins to decide which messages to send) enable the establishment of a fact via a statistical argument without providing any additional information as to why it is true. 
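
A standard example from that framework is an interactive proof that a number is a quadratic residue: the prover convinces the verifier it knows a square root without revealing anything about it, and each round of coin-tossing halves a cheating prover’s chance of getting away with a false claim. The sketch below is a toy, insecure rendering of that textbook protocol, with both parties collapsed into one script for readability.

```python
import random

# Toy interactive zero-knowledge proof that x is a quadratic residue mod N.
# The prover knows a secret square root w of x. Each round, a cheating prover
# survives the verifier's coin flip with probability 1/2, so k rounds leave a
# soundness error of 2^-k. The modulus is tiny and purely illustrative.

N = 499 * 547                  # public composite modulus (toy size)
w = random.randrange(2, N)     # prover's secret witness
x = pow(w, 2, N)               # public claim: "x is a square modulo N"

def run_round():
    r = random.randrange(2, N)
    a = pow(r, 2, N)                     # prover commits to a random square
    b = random.randrange(2)              # verifier's random challenge bit
    z = (r * pow(w, b, N)) % N           # prover's response
    return pow(z, 2, N) == (a * pow(x, b, N)) % N   # verifier's check

rounds = 40
assert all(run_round() for _ in range(rounds))
print(f"verifier accepted the claim after {rounds} rounds")
```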

Zero-knowledge proofs were a striking new philosophical idea that provided the essential language for speaking about the security of cryptographic protocols by controlling the leakage of knowledge. Subsequent works by Oded Goldreich, Micali and Avi Wigderson, and by Michael Ben-Or, Goldwasser and Wigderson, showed that every multiparty computation could be carried out securely, revealing to the players no more knowledge than prescribed by the desired outcome. These papers exhibited the power and utility of zero-knowledge protocols, and demonstrated how broadly applicable they are.

The paper identified interactive proofs as a new method to verify correctness in the exchange of information. Beyond cryptography, interactive proofs can be verified much faster than classical proofs, and can be used in practice to guarantee correctness in a variety of applications such as cloud computing. In a series of works by Goldwasser, Micali and other collaborators, interactive proofs have been extended in several new directions. One direction was to include interactions between a single verifier and multiple provers, which has led to new ways of proving hardness results for approximation problems (an area of active research). Another led to the development of computationally sound proofs, which can be easily compacted and verified, allowing for expedient accuracy checks.

“I am very proud to have won the Turing Award,” Goldwasser says. “Our work was very unconventional at the time. We were graduate students and let our imagination run free, from using randomized methods to encrypt single bits to enlarging the classical definition of a proof to allow a small error to setting new goals for security. Winning the award is further testimony to the fact that the cryptographic and complexity theoretic community embraced these ideas in the last 30 years.”

“I am honored by this recognition and thankful to the computer science community,” Micali adds. “As graduate students, we took some serious risks and faced a few rejections, but also received precious encouragement from exceptional mentors. I am also proud to see how far others have advanced our initial work.”

Past recipients of the Turing Award who have either taught at or earned degrees from MIT include Barbara Liskov (2008), Ronald L. Rivest (2002), Manuel Blum (1995), Butler Lampson (1992), Fernando Corbato (1990), Ivan Sutherland (1988), John McCarthy (1971), Marvin Minsky (1969) and Alan Perlis (1966).

The Turing Award is given annually by the ACM and is named for British mathematician Alan M. Turing, who invented the idea of the computer and who helped the Allies crack the Nazi Enigma cipher during World War II. Goldwasser and Micali will formally receive the award during the ACM’s annual Awards Banquet on June 15 in San Francisco.

By Abby Abazorius, CSAIL

MIT’s graduate program in engineering has been ranked No. 1 in the country in U.S. News & World Report’s annual rankings — a spot the Institute has held since 1990, when the magazine first ranked graduate programs in engineering.

U.S. News awarded MIT a score of 100 among graduate programs in engineering, followed by No. 2 Stanford University (93), No. 3 University of California at Berkeley (87), and No. 4 California Institute of Technology (80).

As was the case last year, MIT’s graduate programs led U.S. News lists in seven engineering disciplines. Top-ranked at MIT this year are programs in aerospace engineering; chemical engineering; materials engineering; computer engineering; electrical engineering (tied with Stanford and Berkeley); mechanical engineering (tied with Stanford); and nuclear engineering (tied with the University of Michigan). MIT’s graduate program in biomedical engineering was also a top-five finisher, tying for third with the University of California at San Diego.

In U.S. News’ first evaluation of PhD programs in the sciences since 2010, five MIT programs earned a No. 1 ranking: biological sciences (tied with Harvard University and Stanford); chemistry (tied with Caltech and Berkeley, and with a No. 1 ranking in the specialty of inorganic chemistry); computer science (tied with Carnegie Mellon University, Stanford, and Berkeley); mathematics (tied with Princeton University, and with a No. 1 ranking in the specialty of discrete mathematics and combinatorics); and physics. MIT’s graduate program in earth sciences was ranked No. 2.

The MIT Sloan School of Management ranked fifth this year among the nation’s top business schools, behind Harvard Business School, Stanford’s Graduate School of Business, the Wharton School at the University of Pennsylvania, and the Booth School of Business at the University of Chicago.

Sloan’s graduate programs in information systems, production/operations, and supply chain/logistics were again ranked first this year; the Institute’s graduate offerings in entrepreneurship (No. 3) and finance (No. 5) also ranked among top-five programs.

U.S. News does not issue annual rankings for all doctoral programs, but revisits many every few years. In the magazine’s 2013 evaluation of graduate programs in economics, MIT tied for first place with Harvard, Princeton, and Chicago.

U.S. News bases its rankings of graduate schools of engineering and business on two types of data: reputational surveys of deans and other academic officials, and statistical indicators that measure the quality of a school’s faculty, research, and students. The magazine’s less-frequent rankings of programs in the sciences, social sciences, and humanities are based solely on reputational surveys.

By News Office

Every time you open your eyes, visual information flows into your brain, which interprets what you’re seeing. Now, for the first time, MIT neuroscientists have noninvasively mapped this flow of information in the human brain with unique accuracy, using a novel brain-scanning technique.

This technique, which combines two existing technologies, allows researchers to identify precisely both the location and timing of human brain activity. Using this new approach, the MIT researchers scanned individuals’ brains as they looked at different images and were able to pinpoint, to the millisecond, when the brain recognizes and categorizes an object, and where these processes occur.

“This method gives you a visualization of ‘when’ and ‘where’ at the same time. It’s a window into processes happening at the millisecond and millimeter scale,” says Aude Oliva, a principal research scientist in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).

Oliva is the senior author of a paper describing the findings in the Jan. 26 issue of Nature Neuroscience. Lead author of the paper is CSAIL postdoc Radoslaw Cichy. Dimitrios Pantazis, a research scientist at MIT’s McGovern Institute for Brain Research, is also an author of the paper.

When and where

Until now, scientists have been able to observe the location or timing of human brain activity at high resolution, but not both, because different imaging techniques are not easily combined. The most commonly used type of brain scan, functional magnetic resonance imaging (fMRI), measures changes in blood flow, revealing which parts of the brain are involved in a particular task. However, it works too slowly to keep up with the brain’s millisecond-by-millisecond dynamics.

Another imaging technique, known as magnetoencephalography (MEG), uses an array of hundreds of sensors encircling the head to measure magnetic fields produced by neuronal activity in the brain. These sensors offer a dynamic portrait of brain activity over time, down to the millisecond, but do not tell the precise location of the signals.

To combine the time and location information generated by these two scanners, the researchers used a computational technique called representational similarity analysis, which relies on the fact that two similar objects (such as two human faces) that provoke similar signals in fMRI will also produce similar signals in MEG. This method has been used before to link fMRI with recordings of neuronal electrical activity in monkeys, but the MIT researchers are the first to use it to link fMRI and MEG data from human subjects.
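
In outline, the method compares “representational geometries”: for every pair of images, how dissimilar are the response patterns they evoke? The sketch below uses made-up data shapes rather than anything from the study; it builds a dissimilarity matrix from fMRI voxel patterns for one region and correlates it with MEG dissimilarity matrices computed time bin by time bin, producing a time course of when the MEG signal starts to resemble that region.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Illustrative representational similarity analysis on synthetic data. The
# shapes are hypothetical: 92 images (as in the study), 500 voxels for one
# fMRI region, and 306 MEG sensors sampled over 120 time bins.
rng = np.random.default_rng(0)
n_images = 92
fmri_patterns = rng.standard_normal((n_images, 500))
meg_patterns = rng.standard_normal((n_images, 306, 120))

# Dissimilarity between every pair of images, based on their fMRI patterns.
fmri_rdm = pdist(fmri_patterns, metric="correlation")

# Correlate that geometry with the MEG geometry at each time bin.
timecourse = []
for t in range(meg_patterns.shape[2]):
    meg_rdm = pdist(meg_patterns[:, :, t], metric="correlation")
    rho, _ = spearmanr(fmri_rdm, meg_rdm)
    timecourse.append(rho)

best = int(np.argmax(timecourse))
print(f"MEG geometry best matches this fMRI region at time bin {best}")
```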

In the study, the researchers scanned 16 human volunteers as they looked at a series of 92 images, including faces, animals, and natural and manmade objects. Each image was shown for half a second.

“We wanted to measure how visual information flows through the brain. It’s just pure automatic machinery that starts every time you open your eyes, and it’s incredibly fast,” Cichy says. “This is a very complex process, and we have not yet looked at higher cognitive processes that come later, such as recalling thoughts and memories when you are watching objects.”

Each subject underwent the test multiple times — twice in an fMRI scanner and twice in an MEG scanner — giving the researchers a huge set of data on the timing and location of brain activity. All of the scanning was done at the Athinoula A. Martinos Imaging Center at the McGovern Institute.

Millisecond by millisecond

By analyzing this data, the researchers produced a timeline of the brain’s object-recognition pathway that is very similar to results previously obtained by recording electrical signals in the visual cortex of monkeys, a technique that is extremely accurate but too invasive to use in humans.

About 50 milliseconds after subjects saw an image, visual information entered a part of the brain called the primary visual cortex, or V1, which recognizes basic elements of a shape, such as whether it is round or elongated. The information then flowed to the inferotemporal cortex, where the brain identified the object as early as 120 milliseconds. Within 160 milliseconds, all objects had been classified into categories such as plant or animal.

The MIT team’s strategy “provides a rich new source of evidence on this highly dynamic process,” says Nikolaus Kriegeskorte, a principal investigator in cognition and brain sciences at Cambridge University.

“The combination of MEG and fMRI in humans is no surrogate for invasive animal studies with techniques that simultaneously have high spatial and temporal precision, but Cichy et al. come closer to characterizing the dynamic emergence of representational geometries across stages of processing in humans than any previous work. The approach will be useful for future studies elucidating other perceptual and cognitive processes,” says Kriegeskorte, who was not part of the research team.

The MIT researchers are now using representational similarity analysis to study the accuracy of computer models of vision by comparing brain scan data with the models’ predictions of how vision works.

Using this approach, scientists should also be able to study how the human brain analyzes other types of information such as motor, verbal, or sensory signals, the researchers say. It could also shed light on processes that underlie conditions such as memory disorders or dyslexia, and could benefit patients suffering from paralysis or neurodegenerative diseases.

“This is the first time that MEG and fMRI have been connected in this way, giving us a unique perspective,” Pantazis says. “We now have the tools to precisely map brain function both in space and time, opening up tremendous possibilities to study the human brain.”

The research was funded by the National Eye Institute, the National Science Foundation, and a Feodor Lynen Research Fellowship from the Humboldt Foundation.
By Anne Trafton, MIT News Office

For years Dr. Stephanie Seneff has been known throughout the computer science world for her work in natural language processing. A computer scientist with a background in biology, Seneff received her undergraduate degree in biophysics before moving to electrical engineering and computer science for her MS and PhD degrees. For over 30 years, she worked on developing new computational systems aimed at providing insight into biological challenges, including the human auditory system, gene prediction and human language, with the goal of improving human-computer interaction.

When her husband was diagnosed with heart disease six years ago and subsequently placed on statin drugs, they were both alarmed when side effects appeared immediately. Seneff vowed to learn more about statin drugs to ensure her husband was receiving the best medical treatment possible. Using the same analytical computer science skills she had developed over the years through her research, Seneff decided to take a similar approach to the study of medicine.

“Drawing on my background in biology, I started reading anything I could find about statin drugs and related research explaining their effects. I also found clinical data about their impact on humans, and was shocked by the adverse effects statin drugs have,” Seneff says. “Being a computer scientist, I started gathering all these drug side effect reports that I found online. I then took all of this information — the research literature and the drug side effect reports — and applied my computer science programs to decipher the content.”

Using standard computer science techniques, Seneff compared the side effects of statin drugs to other drugs taken by humans within the same age distribution. She found many adverse side effects stemming from the usage of statin drugs, and through this work saw a new path for her research: Using computer science techniques to understand human biology, in particular the relationship between nutrition and health.

Based on her experience developing natural language processing algorithms that could parse phrases and locate key words, Seneff developed a way to exploit computer science methods to help her understand how drugs and environmental toxins impacted human health. To study the interaction between health and environmental factors like drugs and toxic chemicals, Seneff found that it was a natural choice to simply modify the natural language processing algorithm she and her colleagues had already developed to analyze and build summaries of restaurant reviews.

Seneff describes her method as a traditional, systems-level computer science approach to biological problems. The technique works by analyzing specific sections of research papers and testimonials documenting drug side effects for unusual statistical frequencies of key symptom-related words like fatigue or nausea, but also for biological terms like homocysteine and glutathione. Seneff then looks at the correlation statistics to see which keywords tend to appear together in the same paragraph or the same drug side effect report. Based on the statistical data, Seneff applies her findings to the research literature to see if she can draw a connection between any of the symptoms documented in the side effect reports, the biologically related words in the research literature, and environmental toxins or medications being used by the selected population.
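
A stripped-down version of that bookkeeping is easy to sketch. The snippet below is a caricature, not Seneff’s actual pipeline: it counts how often a symptom keyword and a biological keyword land in the same report, which is the raw material for the correlation statistics described above.

```python
from collections import Counter
from itertools import product

# Caricature of keyword co-occurrence counting over side-effect reports.
# The keyword lists and reports here are invented for illustration.
symptom_terms = {"fatigue", "nausea", "weakness"}
biology_terms = {"homocysteine", "glutathione", "coq10"}

reports = [
    "patient reports fatigue and muscle weakness; low glutathione noted",
    "nausea after dose increase, elevated homocysteine",
    "fatigue persists; coq10 supplementation discussed",
]

pair_counts = Counter()
for report in reports:
    words = set(report.lower().replace(";", " ").replace(",", " ").split())
    for s, b in product(symptom_terms & words, biology_terms & words):
        pair_counts[(s, b)] += 1

for (symptom, marker), n in pair_counts.most_common():
    print(f"{symptom!r} co-occurs with {marker!r} in {n} report(s)")
```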

“My approach is a systems-level biology approach using simple computer science techniques to help organize the facts. The underlying theme is collecting data from people who are experiencing problems and who live in a particular environment. You examine the data using our computer science techniques, and then you can link specific environments with specific problems. When you go back to the research literature, you can examine what certain materials like glyphosate do to human biology,” says Seneff. “You can then build a hypothesis by studying the research literature and the web-based documentation of symptoms.”

Seneff has applied her technique to understanding everything from Alzheimer’s disease and autism to studying the impact of nutritional deficiencies and environmental toxins like herbicides. Since 2011, she has published 10 papers in various medical and health-related journals. Her findings have shocked the medical world, as she has shown connections between herbicides like glyphosate and the disruption of the human body’s gut bacteria, depletion of essential amino acids and interference with necessary enzymes, possibly leading to Alzheimer’s, autism, obesity and more.

While Seneff’s approach and findings are unconventional to some, she believes her approach to using computer science to better understand the interaction between human health and environmental toxins provides a new viewpoint that could prove insightful.

“In my work I am using a combination of simple but potent computer science tools to understand biology. I think it’s really important to use computer science as a tool to understand biology and to think of biology at a systems level,” Seneff says. “I am trying to look at the whole system and understand all of the interactions between different parts, which may be a product of my training as a computer scientist. I think computer scientists view the problem differently than biologists.”
By Abby Abazorius | CSAIL

Extending a decades-long run, MIT’s graduate program in engineering has again been ranked No. 1 in the country by U.S. News & World Report. MIT has held the top spot since 1990, when the magazine first ranked graduate programs in engineering.

U.S. News awarded MIT a score of 100 among graduate programs in engineering, followed by No. 2 Stanford University (95), No. 3 University of California at Berkeley (87), and No. 4 California Institute of Technology (78).

MIT’s graduate programs led U.S. News lists in seven engineering disciplines, up from four No. 1 rankings last year. Top-ranked at MIT this year are programs in aerospace engineering (tied with Caltech); chemical engineering; materials engineering; computer engineering; electrical engineering (tied with Stanford); mechanical engineering (tied with Stanford); and nuclear engineering. Other top-five graduate programs at MIT include industrial/manufacturing/systems engineering (No. 3, tied with Northwestern University, Stanford and Berkeley) and biomedical engineering (No. 5).

The MIT Sloan School of Management tied for fourth with Northwestern’s Kellogg School of Management among the nation’s top business schools, scoring 97 in U.S. News’ evaluation — just behind Harvard Business School, Stanford’s Graduate School of Business, and the Wharton School at the University of Pennsylvania.

Sloan’s graduate programs in information systems, production/operations, and supply chain/logistics were again ranked first this year; the Institute’s graduate offerings in entrepreneurship (No. 3) also ranked among top-five programs.

U.S. News does not issue annual rankings for all doctoral programs, but revisits many every few years. Last year, in the magazine’s 2013 evaluation of graduate programs in economics, MIT tied for first place with Harvard University, Princeton University and the University of Chicago. The Institute’s graduate programs in chemistry, computer science, earth sciences, mathematics and physics were all top-ranked or tied for No. 1 in 2010.

U.S. News bases its rankings of graduate schools of engineering and business on two types of data: reputational surveys of deans and other academic officials, and statistical indicators that measure the quality of a school’s faculty, research and students. The magazine’s less-frequent rankings of programs in the sciences, social sciences and humanities are based solely on reputational surveys.
By News Office

The birth of artificial-intelligence research as an autonomous discipline is generally thought to have been the monthlong Dartmouth Summer Research Project on Artificial Intelligence in 1956, which convened 10 leading electrical engineers — including MIT’s Marvin Minsky and Claude Shannon — to discuss “how to make machines use language” and “form abstractions and concepts.” A decade later, impressed by rapid advances in the design of digital computers, Minsky was emboldened to declare that “within a generation … the problem of creating ‘artificial intelligence’ will substantially be solved.”

The problem, of course, turned out to be much more difficult than AI’s pioneers had imagined. In recent years, by exploiting machine learning — in which computers learn to perform tasks from sets of training examples — artificial-intelligence researchers have built special-purpose systems that can do things like interpret spoken language or play Jeopardy with great success. But according to Tomaso Poggio, the Eugene McDermott Professor of Brain Sciences and Human Behavior at MIT, “These recent achievements have, ironically, underscored the limitations of computer science and artificial intelligence. We do not yet understand how the brain gives rise to intelligence, nor do we know how to build machines that are as broadly intelligent as we are.”

Poggio thinks that AI research needs to revive its early ambitions. “It’s time to try again,” he says. “We know much more than we did before about biological brains and how they produce intelligent behavior. We’re now at the point where we can start applying that understanding from neuroscience, cognitive science and computer science to the design of intelligent machines.”

The National Science Foundation (NSF) appears to agree: Today, it announced that one of three new research centers funded through its Science and Technology Centers Integrative Partnerships program will be the Center for Brains, Minds and Machines (CBMM), based at MIT and headed by Poggio. Like all the centers funded through the program, CBMM will initially receive $25 million over five years.

Homegrown initiative

CBMM grew out of the MIT Intelligence Initiative, an interdisciplinary program aimed at understanding how intelligence arises in the human brain and how it could be replicated in machines.

“[MIT President] Rafael Reif, when he was provost, came to speak to the faculty and challenged us to come up with new visions, new ideas,” Poggio says. He and MIT’s Joshua Tenenbaum, also a professor in the Department of Brain and Cognitive Sciences (BCS) and a principal investigator in the Computer Science and Artificial Intelligence Laboratory, responded by proposing a program that would integrate research at BCS and the Department of Electrical Engineering and Computer Science. “With a system as complicated as the brain, there is a point where you need to get people to work together across different disciplines and techniques,” Poggio says. Funded by MIT’s School of Science, the initiative was formally launched, in 2011, at a symposium during MIT’s 150th anniversary.

Headquartered at MIT, CBMM will be, like all the NSF centers, a multi-institution collaboration. Of the 20 faculty members currently affiliated with the center, 10 are from MIT, five are from Harvard University, and the rest are from Cornell University, Rockefeller University, the University of California at Los Angeles, Stanford University and the Allen Institute for Brain Science. The center’s international partners are the Italian Institute of Technology; the Max Planck Institute in Germany; City University of Hong Kong; the National Centre for Biological Sciences in India; and Israel’s Weizmann Institute and Hebrew University. Its industrial partners are Google, Microsoft, IBM, Mobileye, Orcam, Boston Dynamics, Willow Garage, DeepMind and Rethink Robotics. Also affiliated with the center are Howard University; Hunter College; Universidad Central del Caribe, Puerto Rico; the University of Puerto Rico, Río Piedras; and Wellesley College.

CBMM aims to foster collaboration not just between institutions but also across disciplinary boundaries. Graduate students and postdocs funded through the center will have joint advisors, preferably drawn from different research areas.

Research themes

The center’s four main research themes are also intrinsically interdisciplinary. They are the integration of intelligence, including vision, language and motor skills; circuits for intelligence, which will span research in neurobiology and electrical engineering; the development of intelligence in children; and social intelligence. Poggio will also lead the development of a theoretical platform intended to undergird the work in all four areas.

“Those four thrusts really do fit together, in the sense that they cover what we think are the biggest challenges facing us when we try to develop a computational understanding of what intelligence is all about,” says Patrick Winston, the Ford Foundation Professor of Engineering at MIT and research coordinator for CBMM.

For instance, he explains, in human cognition, vision, language and motor skills are inextricably linked, even though they’ve been treated as separate problems in most recent AI research. One of Winston’s favorite examples is that of image labeling: A human subject will identify an image of a man holding a glass to his lips as that of a man drinking. If the man is holding the glass a few inches further forward, it’s an instance of a different activity — toasting. But a human will also identify an image of a cat turning its head up to catch a few drops of water from a faucet as an instance of drinking. “You have to be thinking about what you see there as a story,” Winston says. “They get the same label because it’s the same story, not because it looks the same.”

Similarly, Winston explains, development is its own research thrust because intelligence is fundamentally shaped through interaction with the environment. There’s evidence, Winston says, that mammals that receive inadequate visual stimulation in the first few weeks of life never develop functional eyesight, even though their eyes are otherwise unimpaired. “You need to stimulate the neural mechanisms in order for them to assemble themselves into a functioning system,” Winston says. “We think that that’s true generally, of our entire spectrum of capabilities. You need to have language, you need to see things, you need to have language and vision work together from the beginning to ensure that the parts develop properly to form a working whole.”
By Larry Hardesty, MIT News Office

For many companies, moving their web-application servers to the cloud is an attractive option, since cloud-computing services can offer economies of scale, extensive technical support and easy accommodation of demand fluctuations.

But for applications that depend heavily on database queries, cloud hosting can pose as many problems as it solves. Cloud services often partition their servers into “virtual machines,” each of which gets so many operations per second on a server’s central processing unit, so much space in memory, and the like. That makes cloud servers easier to manage, but for database-intensive applications, it can result in the allocation of about 20 times as much hardware as should be necessary. And the cost of that overprovisioning gets passed on to customers.

MIT researchers are developing a new system called DBSeer that should help solve this problem and others, such as the pricing of cloud services and the diagnosis of application slowdowns. At the recent Biennial Conference on Innovative Data Systems Research, the researchers laid out their vision for DBSeer. And in June, at the annual meeting of the Association for Computing Machinery’s Special Interest Group on Management of Data (SIGMOD), they will unveil the algorithms at the heart of DBSeer, which use machine-learning techniques to build accurate models of performance and resource demands of database-driven applications.

DBSeer’s advantages aren’t restricted to cloud computing, either. Teradata, a major database company, has already assigned several of its engineers the task of importing the MIT researchers’ new algorithm — which has been released under an open-source license — into its own software.

Virtual limitations

Barzan Mozafari, a postdoc in the lab of professor of electrical engineering and computer science Samuel Madden and lead author on both new papers, explains that, with virtual machines, server resources must be allocated according to an application’s peak demand. “You’re not going to hit your peak load all the time,” Mozafari says. “So that means that these resources are going to be underutilized most of the time.”

Moreover, Mozafari says, the provisioning for peak demand is largely guesswork. “It’s very counterintuitive,” Mozafari says, “but you might take on certain types of extra load that might help your overall performance.” Increased demand means that a database server will store more of its frequently used data in its high-speed memory, which can help it process requests more quickly.

On the other hand, a slight increase in demand could cause the system to slow down precipitously — if, for instance, too many requests require modification of the same pieces of data, which need to be updated on multiple servers. “It’s extremely nonlinear,” Mozafari says.

Mozafari, Madden, postdoc Alekh Jindal, and Carlo Curino, a former member of Madden’s group who’s now at Microsoft, use two different techniques in the SIGMOD paper to predict how a database-driven application will respond to increased load. Mozafari describes the first as a “black box” approach: DBSeer simply monitors fluctuations in both the number and type of user requests and system performance and uses machine-learning techniques to correlate the two. This approach is good at predicting the consequences of fluctuations that don’t fall too far outside the range of the training data.
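
In spirit, the black-box mode is ordinary supervised regression: the features are the observed mix of request types in each time window, the target is a measured resource such as CPU load, and the fitted model is then asked about workloads near the range it was trained on. The sketch below uses synthetic numbers and scikit-learn as a stand-in; it is not DBSeer’s code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Black-box-style performance modeling on synthetic data: learn how CPU load
# responds to the mix of query types, then ask about a heavier mix. The query
# types, workload and per-query costs are all invented for illustration.
rng = np.random.default_rng(1)

# Per-minute counts of three hypothetical query types observed on a live system.
query_mix = rng.poisson(lam=[200.0, 50.0, 10.0], size=(500, 3))
true_cost = np.array([0.02, 0.10, 0.45])               # hidden per-query CPU cost
cpu_load = query_mix @ true_cost + rng.normal(0, 0.5, 500)

model = LinearRegression().fit(query_mix, cpu_load)

# Predict for a workload moderately beyond the training range; far larger
# extrapolations are where the gray-box model described below takes over.
heavier_mix = np.array([[300, 80, 15]])
print(f"predicted CPU load: {model.predict(heavier_mix)[0]:.1f}")
```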

Gray areas

Often, however, database managers — or prospective cloud-computing customers — will be interested in the consequences of a fourfold, tenfold, or even hundredfold increase in demand. For those types of predictions, Mozafari explains, DBSeer uses a “gray box” model, which takes into account the idiosyncrasies of particular database systems.

For instance, Mozafari explains, updating data stored on a hard drive is time-consuming, so most database servers will try to postpone that operation as long as they can, instead storing data modifications in the much faster — but volatile — main memory. At some point, however, the server has to commit its pending modifications to disk, and the criteria for making that decision can vary from one database system to another.

The version of DBSeer presented at SIGMOD includes a gray-box model of MySQL, one of the most widely used database systems. The researchers are currently building a new model for another popular system, PostgreSQL. Although adapting the model isn’t a negligible undertaking, models tailored to just a handful of systems would cover the large majority of database-driven Web applications.

The researchers tested their prediction algorithm against both a set of benchmark data, called TPC-C, that’s commonly used in database research and against real-world data on modifications to the Wikipedia database. On average, the model was about 80 percent accurate in predicting CPU use and 99 percent accurate in predicting the bandwidth consumed by disk operations.

“We’re really fascinated and thrilled that someone is doing this work,” says Doug Brown, a database software architect at Teradata. “We’ve already taken the code and are prototyping right now.” Initially, Brown says, Teradata will use the MIT researchers’ prediction algorithm to determine customers’ resource requirements. “The really big question for our customers is, ‘How are we going to scale?’” Brown says.

Brown hopes, however, that the algorithm will ultimately help allocate server resources on the fly, as database requests come in. If servers can assess the demands imposed by individual requests and budget accordingly, they can ensure that transaction times stay within the bounds set by customers’ service agreements. For instance, “if you have two big, big resource consumers, you can calculate ahead of time that we’re only going to run two of these in parallel,” Brown says. “There’s all kinds of games you can play in workload management.”
By Larry Hardesty, MIT News Office

With recent advances in three-dimensional (3-D) printing technology, it is now possible to produce a wide variety of 3-D objects, utilizing computer graphics models and simulations. But while the hardware exists to reproduce complex, multi-material objects, the software behind the printing process is cumbersome, slow and difficult to use, and needs to improve substantially if 3-D technology is to become more mainstream.

On July 25, a team of researchers from the MIT Computer Science and Artificial Intelligence Lab (CSAIL) will present two papers at the SIGGRAPH computer graphics conference in Anaheim, California, which propose new methods for streamlining and simplifying the 3-D printing process, utilizing more efficient, intuitive and accessible technologies.

“Our goal is to make 3-D printing much easier and less computationally complex,” said Associate Professor Wojciech Matusik, co-author of the papers and a leader of the Computer Graphics Group at CSAIL. “Ours is the first work that unifies design, development and implementation into one seamless process, making it possible to easily translate an object from a set of specifications into a fully operational 3-D print.”

3-D printing poses enormous computational challenges to existing software. For starters, in order to fabricate complex surfaces containing bumps, color gradations and other intricacies, printing software must produce an extremely high-resolution model of the object, with detailed information on each surface that is to be replicated. Such models often amount to petabytes of data, which current programs have difficulty processing and storing.

To address these challenges, Matusik and his team developed OpenFab, a programmable “pipeline” architecture. Inspired by RenderMan, the software used to design computer-generated imagery commonly seen in movies, OpenFab allows for the production of complex structures with varying material properties. To specify intricate surface details and the composition of a 3-D object, OpenFab uses “fablets”, programs written in a new programming language that allow users to modify the look and feel of an object easily and efficiently.

“Our software pipeline makes it easier to design and print new materials and to continuously vary the properties of the object you are designing,” said Kiril Vidimče, lead author of one of the two papers and a PhD student at CSAIL. “In traditional manufacturing most objects are composed of multiple parts made out of the same material. With OpenFab, the user can change the material consistency of an object, for example designing the object to transition from stiff at one end to flexible and compressible at the other end.”
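
The flavor of that kind of specification, describing material as a function of position rather than as a list of parts, can be sketched in a few lines. The function below is not written in OpenFab’s fablet language; it is a hypothetical Python stand-in showing how a stiff-to-flexible gradient might be expressed procedurally and evaluated point by point.

```python
# Hypothetical procedural material specification, in the spirit of (but not in)
# OpenFab's fablet language: given a point inside the object, return how much
# rigid and how much flexible material to deposit there.

def material_at(x, y, z, length=100.0):
    """Blend from fully rigid at z = 0 to fully flexible at z = length."""
    t = min(max(z / length, 0.0), 1.0)       # normalized position along the object
    return {"rigid": 1.0 - t, "flexible": t}

# A printer pipeline could evaluate this on demand, voxel by voxel, rather than
# storing an enormous precomputed model of the whole object.
for z in (0.0, 25.0, 50.0, 75.0, 100.0):
    print(z, material_at(0.0, 0.0, z))
```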

Thanks to OpenFab’s streaming architecture, data about the design of the 3-D object is computed on demand and sent to the printer as it becomes available, with little start-up delay. So far, Matusik’s research team has been able to replicate a wide array of objects using OpenFab, including an insect embedded in amber, a marble table and a squishy teddy bear.

To create lifelike objects that are hard or soft, reflect light and conform to touch, users must currently specify the material composition of the object they wish to replicate. This is no easy task, as it’s often easier to define the desired end-state of an object — for example, saying that an object needs to be soft — than to determine which materials should be used in its production.

To simplify this process, Matusik and his colleagues developed a new methodology called Spec2Fab. Instead of requiring explicit design specifications for each region of a print, and testing every possible combination, Spec2Fab employs a “reducer tree”, which breaks the object down into more manageable chunks. Spec2Fab’s “tuner network” then uses the reducer tree to automatically determine the material composition of an object.
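
As a caricature of that inverse problem, and not of Spec2Fab’s actual algorithms, the sketch below splits an object into a few regions, a one-level “reducer tree,” and has a trivial tuner pick for each region the two-material blend that best matches a target stiffness.

```python
# Caricature of spec-driven material assignment: the designer states a desired
# stiffness per region, and a tuner solves for the blend of a soft and a stiff
# printable material that comes closest. All materials and numbers are invented.

SOFT_STIFFNESS = 1.0      # arbitrary units
STIFF_STIFFNESS = 10.0

def tune_region(target_stiffness):
    """Pick the fraction of stiff material whose linear blend matches the target."""
    frac = (target_stiffness - SOFT_STIFFNESS) / (STIFF_STIFFNESS - SOFT_STIFFNESS)
    frac = min(max(frac, 0.0), 1.0)           # clamp to what the printer can do
    achieved = SOFT_STIFFNESS + frac * (STIFF_STIFFNESS - SOFT_STIFFNESS)
    return frac, achieved

# The object reduced to three independent regions with desired stiffness values.
spec = {"handle": 9.0, "hinge": 3.5, "grip": 1.5}
for region, target in spec.items():
    frac, achieved = tune_region(target)
    print(f"{region}: {frac:.0%} stiff material (achieved stiffness {achieved:.1f})")
```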

By combining existing computer graphics algorithms, Matusik’s team has used Spec2Fab to create a multitude of 3-D prints, creating optical effects like caustic images and objects with specific deformation and textural properties.

“Spec2Fab is a small but powerful toolbox for building algorithms that can produce an endless array of complex, printable objects,” said Desai Chen, a PhD student at CSAIL and lead author of one of the papers presented at SIGGRAPH.

The two papers to be presented at SIGGRAPH are “OpenFab: A Programmable Pipeline for Multi-Material Fabrication,” authored by Kiril Vidimče, Szu-Po Wang, Jonathan Ragan-Kelley and Wojciech Matusik; and “Spec2Fab: A Reducer-Tuner Model for Translating Specifications to 3-D Prints,” authored by Desai Chen, David I. W. Levin, Piotr Didyk, Pitchaya Sitthi-Amorn and Wojciech Matusik.

For more information on OpenFab, please visit: http://openfab.mit.edu/. For more information on Spec2Fab, please visit: http://spec2fab.mit.edu.

By Abby Abazorius | CSAIL

“The sounds uttered by birds offer in several respects the nearest analogy to language,” Charles Darwin wrote in “The Descent of Man” (1871), while contemplating how humans learned to speak. Language, he speculated, might have had its origins in singing, which “might have given rise to words expressive of various complex emotions.”

Now researchers from MIT, along with a scholar from the University of Tokyo, say that Darwin was on the right path. The balance of evidence, they believe, suggests that human language is a grafting of two communication forms found elsewhere in the animal kingdom: first, the elaborate songs of birds, and second, the more utilitarian, information-bearing types of expression seen in a diversity of other animals.

“It’s this adventitious combination that triggered human language,” says Shigeru Miyagawa, a professor of linguistics in MIT’s Department of Linguistics and Philosophy, and co-author of a new paper published in the journal Frontiers in Psychology.

The idea builds upon Miyagawa’s conclusion, detailed in his previous work, that there are two “layers” in all human languages: an “expression” layer, which involves the changeable organization of sentences, and a “lexical” layer, which relates to the core content of a sentence. His conclusion is based on earlier work by linguists including Noam Chomsky, Kenneth Hale and Samuel Jay Keyser.

Based on an analysis of animal communication, and using Miyagawa’s framework, the authors say that birdsong closely resembles the expression layer of human sentences — whereas the communicative waggles of bees, or the short, audible messages of primates, are more like the lexical layer. At some point, between 50,000 and 80,000 years ago, humans may have merged these two types of expression into a uniquely sophisticated form of language.

“There were these two pre-existing systems,” Miyagawa says, “like apples and oranges that just happened to be put together.”

These kinds of adaptations of existing structures are common in natural history, notes Robert Berwick, a co-author of the paper, who is a professor of computational linguistics in MIT’s Laboratory for Information and Decision Systems, in the Department of Electrical Engineering and Computer Science.

“When something new evolves, it is often built out of old parts,” Berwick says. “We see this over and over again in evolution. Old structures can change just a little bit, and acquire radically new functions.”

A new chapter in the songbook

The new paper, “The Emergence of Hierarchical Structure in Human Language,” was co-written by Miyagawa, Berwick and Kazuo Okanoya, a biopsychologist at the University of Tokyo who is an expert on animal communication.

To consider the difference between the expression layer and the lexical layer, take a simple sentence: “Todd saw a condor.” We can easily create variations of this, such as, “When did Todd see a condor?” This rearranging of elements takes place in the expression layer and allows us to add complexity and ask questions. But the lexical layer remains the same, since it involves the same core elements: the subject, “Todd,” the verb, “to see,” and the object, “condor.”

Birdsong lacks a lexical structure. Instead, birds sing learned melodies with what Berwick calls a “holistic” structure; the entire song has one meaning, whether about mating, territory or other things. The Bengalese finch, as the authors note, can loop back to parts of previous melodies, allowing for greater variation and communication of more things; a nightingale may be able to recite from 100 to 200 different melodies.

By contrast, other types of animals have bare-bones modes of expression without the same melodic capacity. Bees communicate visually, using precise waggles to indicate sources of food to their peers; other primates can make a range of sounds, including warnings about predators and other messages.

Humans, according to Miyagawa, Berwick and Okanoya, fruitfully combined these systems. We can communicate essential information, like bees or primates — but like birds, we also have a melodic capacity and an ability to recombine parts of our uttered language. For this reason, our finite vocabularies can generate a seemingly infinite string of words. Indeed, the researchers suggest that humans first had the ability to sing, as Darwin conjectured, and then managed to integrate specific lexical elements into those songs.

“It’s not a very long step to say that what got joined together was the ability to construct these complex patterns, like a song, but with words,” Berwick says.

As they note in the paper, some of the “striking parallels” between language acquisition in birds and humans include the phase of life when each is best at picking up languages, and the part of the brain used for language. Another similarity, Berwick notes, relates to an insight of celebrated MIT professor emeritus of linguistics Morris Halle, who, as Berwick puts it, observed that “all human languages have a finite number of stress patterns, a certain number of beat patterns. Well, in birdsong, there is also this limited number of beat patterns.”

Birds and bees

Norbert Hornstein, a professor of linguistics at the University of Maryland, says the paper has been “very well received” among linguists, and “perhaps will be the standard go-to paper for language-birdsong comparison for the next five years.”

Hornstein adds that he would like to see further comparison of birdsong and sound production in human language, as well as more neuroscientific research, pertaining to both birds and humans, to see how brains are structured for making sounds.

The researchers acknowledge that further empirical studies on the subject would be desirable.

“It’s just a hypothesis,” Berwick says. “But it’s a way to make explicit what Darwin was talking about very vaguely, because we know more about language now.”

Miyagawa, for his part, asserts it is a viable idea in part because it could be subject to more scrutiny, as the communication patterns of other species are examined in further detail. “If this is right, then human language has a precursor in nature, in evolution, that we can actually test today,” he says, adding that bees, birds and other primates could all be sources of further research insight.

MIT-based research in linguistics has largely been characterized by the search for universal aspects of all human languages. With this paper, Miyagawa, Berwick and Okanoya hope to spur others to think of the universality of language in evolutionary terms. It is not just a random cultural construct, they say, but based in part on capacities humans share with other species. At the same time, Miyagawa notes, human language is unique, in that two independent systems in nature merged, in our species, to allow us to generate unbounded linguistic possibilities, albeit within a constrained system.

“Human language is not just freeform, but it is rule-based,” Miyagawa says. “If we are right, human language has a very heavy constraint on what it can and cannot do, based on its antecedents in nature.”
By Peter Dizikes, MIT News Office