
Wednesday, October 28, 2009
Tuesday, June 23, 2009
Friday, May 1, 2009
Wolfram on Wolfram Alpha
"Some might say that Mathematica and A New Kind of Science are ambitious projects.
But in recent years I’ve been hard at work on a still more ambitious project—called Wolfram|Alpha.
And I’m excited to say that in just two months it’s going to be going live:
Mathematica has been a great success in very broadly handling all kinds of formal technical systems and knowledge.
But what about everything else? What about all other systematic knowledge? All the methods and models, and data, that exists?
Fifty years ago, when computers were young, people assumed that they’d quickly be able to handle all these kinds of things and that one would be able to ask a computer any factual question, and have it compute the answer.
But it didn’t work out that way. Computers have been able to do many remarkable and unexpected things. But not that.
I’d always thought, though, that eventually it should be possible. And a few years ago, I realized that I was finally in a position to try to do it.
I had two crucial ingredients: Mathematica and NKS. With Mathematica, I had a symbolic language to represent anything—as well as the algorithmic power to do any kind of computation. And with NKS, I had a paradigm for understanding how all sorts of complexity could arise from simple rules.
But what about all the actual knowledge that we as humans have accumulated?
A lot of it is now on the web—in billions of pages of text. And with search engines, we can very efficiently search for specific terms and phrases in that text.
But we can’t compute from that. And in effect, we can only answer questions that have been literally asked before. We can look things up, but we can’t figure anything new out.
So how can we deal with that? Well, some people have thought the way forward must be to somehow automatically understand the natural language that exists on the web. Perhaps getting the web semantically tagged to make that easier.
But armed with Mathematica and NKS I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable.
It’s not easy to do this. Every different kind of method and model—and data—has its own special features and character. But with a mixture of Mathematica and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.
How can I say it?
"But, OK. Let’s say we succeed in creating a system that knows a lot, and can figure a lot out. How can we interact with it?
The way humans normally communicate is through natural language. And when one’s dealing with the whole spectrum of knowledge, I think that’s the only realistic option for communicating with computers too.
Of course, getting computers to deal with natural language has turned out to be incredibly difficult. And for example we’re still very far away from having computers systematically understand large volumes of natural language text on the web.
But if one’s already made knowledge computable, one doesn’t need to do that kind of natural language understanding.
All one needs to be able to do is to take questions people ask in natural language, and represent them in a precise form that fits into the computations one can do.
Of course, even that has never been done in any generality. And it’s made more difficult by the fact that one doesn’t just want to handle a language like English: one also wants to be able to handle all the shorthand notations that people in every possible field use.
I wasn’t at all sure it was going to work. But I’m happy to say that with a mixture of many clever algorithms and heuristics, lots of linguistic discovery and linguistic curation, and what probably amount to some serious theoretical breakthroughs, we’re actually managing to make it work.
Neverending trillions
"Pulling all of this together to create a true computational knowledge engine is a very difficult task.
It’s certainly the most complex project I’ve ever undertaken. Involving far more kinds of expertise—and more moving parts—than I’ve ever had to assemble before.
And—like Mathematica, or NKS—the project will never be finished.
But I’m happy to say that we’ve almost reached the point where we feel we can expose the first part of it.
It’s going to be a website: www.wolframalpha.com. With one simple input field that gives access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms.
We’re all working very hard right now to get Wolfram|Alpha ready to go live.
I think it’s going to be pretty exciting. A new paradigm for using computers and the web.
That almost gets us to what people thought computers would be able to do 50 years ago!
Due Soon: Wolfram Alpha
Wolfram Alpha differs from search engines in that it does not simply return a list of results based on a keyword, but instead computes answers and relevant visualizations from a collection of known information. Other new search engines, known collectively as semantic search engines, have developed alpha applications of this type, which index a large number of answers and then try to match the question to one of them. Examples of companies using this strategy include True Knowledge and Microsoft's Powerset.
Wolfram Alpha has many parallels with Cyc, a project that has aimed since the 1980s to develop a common-sense inference engine, though without producing any major commercial application. Cyc founder Douglas Lenat was one of the few given an opportunity to test Wolfram Alpha before its release:
It handles a much wider range of queries than Cyc, but much narrower than Google; it understands some of what it is displaying as an answer, but only some of it ... The bottom line is that there are a large range of queries it can't parse, and a large range of parsable queries it can't answer
-Douglas Lenat[2]
Wolfram's earlier flagship product, Mathematica, encompasses computer algebra, numerical computation, visualization and statistics capabilities, and can be used for all kinds of mathematical analysis, from simple plotting to signal processing. It will not be included in the alpha release, however, because of computation-time problems.[3]
Monday, January 5, 2009
Thursday, November 13, 2008
Applications of Google Insights - 1 | 2 | 3
Google Uses Searches to Track Flu’s Spread
The simple act of typing flu-related queries into a search engine, multiplied across millions of keyboards in homes around the country, has given rise to a new early warning system for fast-spreading flu outbreaks, called Google Flu Trends.
Tests of the new Web tool from Google.org, the company’s philanthropic unit, suggest that it may be able to detect regional outbreaks of the flu a week to 10 days before they are reported by the Centers for Disease Control and Prevention.
In early February, for example, the C.D.C. reported that the flu cases had recently spiked in the mid-Atlantic states. But Google says its search data show a spike in queries about flu symptoms two weeks before that report was released. Its new service at google.org/flutrends analyzes those searches as they come in, creating graphs and maps of the country that, ideally, will show where the flu is spreading.
The C.D.C. reports are slower because they rely on data collected and compiled from thousands of health care providers, labs and other sources. Some public health experts say the Google data could help accelerate the response of doctors, hospitals and public health officials to a nasty flu season, reducing the spread of the disease and, potentially, saving lives.
“The earlier the warning, the earlier prevention and control measures can be put in place, and this could prevent cases of influenza,” said Dr. Lyn Finelli, lead for surveillance at the influenza division of the C.D.C. From 5 to 20 percent of the nation’s population contracts the flu each year, she said, leading to roughly 36,000 deaths on average.
The service covers only the United States, but Google is hoping to eventually use the same technique to help track influenza and other diseases worldwide.
“From a technological perspective, it is the beginning,” said Eric E. Schmidt, Google’s chief executive.
The premise behind Google Flu Trends — what appears to be a fruitful marriage of mob behavior and medicine — has been validated by an unrelated study indicating that the data collected by Yahoo, Google’s main rival in Internet search, can also help with early detection of the flu.
“In theory, we could use this stream of information to learn about other disease trends as well,” said Dr. Philip M. Polgreen, assistant professor of medicine and epidemiology at the University of Iowa and an author of the study based on Yahoo’s data.
Still, some public health officials note that many health departments already use other approaches, like gathering data from visits to emergency rooms, to keep daily tabs on disease trends in their communities.
“We don’t have any evidence that this is more timely than our emergency room data,” said Dr. Farzad Mostashari, assistant commissioner of the Department of Health and Mental Hygiene in New York City.
If Google provided health officials with details of the system’s workings so that it could be validated scientifically, the data could serve as an additional, free way to detect influenza, said Dr. Mostashari, who is also chairman of the International Society for Disease Surveillance.
A paper on the methodology of Google Flu Trends is expected to be published in the journal Nature.
Researchers have long said that the material published on the Web amounts to a form of “collective intelligence” that can be used to spot trends and make predictions.
But the data collected by search engines is particularly powerful, because the keywords and phrases that people type into them represent their most immediate intentions. People may search for “Kauai hotel” when they are planning a vacation and for “foreclosure” when they have trouble with their mortgage. Those queries express the world’s collective desires and needs, its wants and likes.
Internal research at Yahoo suggests that increases in searches for certain terms can help forecast what technology products will be hits, for instance. Yahoo has begun using search traffic to help it decide what material to feature on its site.
Two years ago, Google began opening its search data trove through Google Trends, a tool that allows anyone to track the relative popularity of search terms. Google also offers more sophisticated search traffic tools that marketers can use to fine-tune ad campaigns. And internally, the company has tested the use of search data to reach conclusions about economic, marketing and entertainment trends.
“Most forecasting is basically trend extrapolation,” said Hal Varian, Google’s chief economist. “This works remarkably well, but tends to miss turning points, times when the data changes direction. Our hope is that Google data might help with this problem.”
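Mr. Varian's point about turning points can be illustrated with a minimal sketch. It assumes the simplest possible form of trend extrapolation (continuing the most recent step) and uses an invented series; the function name and the numbers are hypothetical and are not drawn from Google's or Yahoo's tools.

```python
def extrapolate_next(series):
    """Forecast the next value by continuing the most recent step (naive linear trend)."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    return series[-1] + (series[-1] - series[-2])


# Hypothetical series, invented for illustration: it rises, peaks, then falls.
observed = [10, 12, 14, 16, 15, 13]

# Forecast each point from the history before it and compare with what happened.
for t in range(2, len(observed)):
    forecast = extrapolate_next(observed[:t])
    error = forecast - observed[t]
    print(f"t={t}: forecast={forecast}, actual={observed[t]}, error={error}")
```

While the series keeps rising, the forecasts match. The largest miss comes at the step where the series changes direction, which is the kind of moment an outside signal such as search data might help to flag.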
Prabhakar Raghavan, who is in charge of Yahoo Labs and the company’s search strategy, also said search data could be valuable for forecasters and scientists, but privacy concerns had generally stopped it from sharing it with outside academics.
Google Flu Trends avoids privacy pitfalls by relying only on aggregated data that cannot be traced to individual searchers. To develop the service, Google’s engineers devised a basket of keywords and phrases related to the flu, including thermometer, flu symptoms, muscle aches, chest congestion and many others.
Google then dug into its database, extracted five years of data on those queries and mapped it onto the C.D.C.’s reports of influenzalike illness. Google found a strong correlation between its data and the reports from the agency, which advised it on the development of the new service.
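The comparison described above, aggregated query data set against C.D.C. reports, can be sketched as a plain correlation calculation. This is only an illustration under stated assumptions: the weekly figures are invented, the variable names are hypothetical, and the Pearson coefficient is used as a stand-in because the article does not say which statistical method Google applied.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly values, invented for illustration only.
# query_share: aggregated share of searches matching a flu keyword basket
# ili_percent: percent of doctor visits reported as influenza-like illness
query_share = [0.8, 0.9, 1.4, 2.1, 3.0, 2.6, 1.7, 1.1]
ili_percent = [1.0, 1.1, 1.3, 1.9, 2.8, 3.1, 2.2, 1.4]

r = pearson(query_share, ili_percent)
print(f"correlation between query share and reported illness: {r:.2f}")
```

Only aggregated counts enter the calculation, in keeping with the article's note that the service does not rely on data traceable to individual searchers. A coefficient near 1 would indicate that the two series rise and fall together, but it would not by itself show that the relationship will hold in future seasons.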
“We know it matches very, very well in the way flu developed in the last year,” said Dr. Larry Brilliant, executive director of Google.org. Dr. Finelli of the C.D.C. and Dr. Brilliant both cautioned that the data needed to be monitored to ensure that the correlation with flu activity remained valid.
Google also says it believes the tool may help people take precautions if a disease is in their area.
Others have tried to use information collected from Internet users for public health purposes. A Web site called whoissick.org, for instance, invites people to report what ails them and superimposes the results on a map. But the site has received relatively little traffic.
HealthMap, a project affiliated with the Children’s Hospital Boston, scours the Web for articles, blog posts and newsletters to create a map that tracks emerging infectious diseases around the world. It is backed by Google.org, which counts the detection and prevention of diseases as one of its main philanthropic objectives.
But Google Flu Trends appears to be the first public project that uses the powerful database of a search engine to track a disease.
“This seems like a really clever way of using data that is created unintentionally by the users of Google to see patterns in the world that would otherwise be invisible,” said Thomas W. Malone, a professor at the Sloan School of Management at M.I.T. “I think we are just scratching the surface of what’s possible with collective intelligence.”
A version of this article appeared in print on November 12, 2008, on page A1 of the New York edition.