
Interview: IBM's Jerome Pesenti on Watson Developer Cloud

The Watson team is on the cutting edge of machine learning, and they're actively partnering with other research groups to advance the field as quickly as possible. Recently, we had the opportunity to talk to Jerome Pesenti, Vice President of Watson Core Technology, about what IBM has been working on lately, and how it will impact end-users.
By Grant Brunner

Watson, the cognitive computing platform from IBM, first came to prominence when it made an impressive appearance on Jeopardy in 2011. Since then, IBM has taken that core machine learning technology, and made it available to the world through its Bluemix cloud computing platform.

Dubbed the "Watson Developer Cloud," this selection of APIs allows third-party developers to harness IBM's research and build applications and websites that offer impressive functionality. For example, one of the latest beta APIs analyzes the tone of your writing, and IBM released a public demo that will recommend alternative wording.

Since the initial public launch, the Watson team has continued to roll out new functionality and publish peer-reviewed papers on its findings. Recently, I had the opportunity to talk to Jerome Pesenti, Vice President of Watson Core Technology, about what IBM has been working on lately, and how it will impact end-users. Below is the transcript of our conversation (edited slightly for clarity).

Grant Brunner: Tell me a little bit about your professional background, and how you came to the Watson project. Jerome Pesenti: Sure. I was a post-doc at Carnegie Mellon University, and I started a company out of there[...] I grew it to kind of a hundred -- 135 people -- and it got acquired by IBM in 2012 -- I was a founder of the company that got acquired. So I started at IBM, and in about a year, I was offered a position within the Watson group that was just created in 2014. So today, I have actually a pretty large R&D team within Watson. I'm in charge of both core technology and the platform. So, I take care of looking at the core algorithms and functions, and exposing them as cloud services. There's a really strong cloud strategy within Watson to expose everything we do as very easy-to-consume services within the cloud. So, the focus is on core cognitive technologies and developing new algorithms, so there's a lot of ongoing effort right now around deep learning[...] I have five teams working under me on some deep learning aspects. And then I'm collaborating a lot with research, which has a number of other teams working on this problem.

And then, so we're trying to take these advanced technologies and expose them in Watson Developer Cloud -- in our cloud platform -- as quickly as possible, on a very quick cycle. Just to tie it back to the announcements, just last week we announced three new services, with a strategy of putting new services out there. So, we put out services for transcribing speech into text, converting text into speech, and translating from language to language.

And then, we also announced a partnership with [the Montreal Institute for Learning Algorithms] where you have one of the big three in deep learning: Yoshua Bengio. And so we have a research partnership with them that's starting, and with lots of interaction -- lots of interesting projects.

GB: Bluemix is IBM's cloud platform, but how does Watson play into that? What makes the Watson subset different from other aspects of Bluemix? JP: So, you understand that when I talk cloud services, everything is exposed through Bluemix, right? GB: Right. JP: So, we're really a section of Bluemix. So the key is to define what makes Watson. In general, it's to create systems that use some machine learning or some natural language processing or a combination of the two[...] So in general, it means the kind of human functions whose result is not necessarily clearly defined. It's not about a function where you know what the outcome is going to be. Because it's like humans, right? Humans make judgments. So even when you're turning speech into phonemes, there's not a 100% way of transcribing it because the sounds are vague in some cases[...]

GB: What, exactly, is the relationship with the MILA? JP: I believe we have seven different research teams, including the five under me, that are basically collaborating with their research team. So, we have common projects that we are working together on, and that will hopefully lead to some research that's published out there.

So, we're giving them some resources. They're giving us some attention, and our goal is to direct their research a little bit more toward problems that we know will lead to outcomes that will bring value to our customers. We don't want to just create for the sake of it, to show "hey, it's fun, the system can show human-like behavior" -- we're also trying to think "okay, how do we take this, and give value to our customers?"

But the fun part of what's happening is that you can go from fundamental research to value-to-customer in a very short cycle[...] So, it's a very interesting time to do this kind of research.

GB: What makes Watson's machine learning stand out from the competition? JP: Well, I think it's three things. One is, obviously, Watson. You know where that comes from, right? It comes from 2011, and IBM made a big gamble to show it on TV, right[...] If you read a little bit, it's a very interesting history. When we started the project around 2006/2007, there was no idea what we could actually accomplish -- the state of the art at the time was very low. So we went on TV, and we showed that we could do it, and there was a lot of credit to get from this, because it shows that we know how to take this kind of technology and create a system that has very good performance in a completely realistic environment. You cannot control anything. We didn't control the questions that were there, so I think that's the first thing.

Thanks to this, we were exposed to a lot of interesting partners. In the medical space, we've been working with the three top cancer centers (MD Anderson, Memorial Sloan Kettering, and Mayo Clinic) to create Watson in health care. And now we have hundreds of partners and clients that we're working with in something like 36 different industries[...]

I think we're in the lead of trying to apply these technologies to real-world problems. And coming now, not just from Jeopardy, but coming from our customers, and we've been at it for four years. We kind of declared this space before anyone got there[...] We are offering a wide range of services from vision to speech to dialog to natural language to personality[...] We have a very broad set, I believe the broadest of anybody out there, and we're exposing it on a platform for anyone to build on.

So those are the three things you asked about. One is we demoed it on Jeopardy. Two is we're working with an order of magnitude more customer use cases than anybody else. And three, we have a very broad set of technologies that we're exposing out there.

GB: The three APIs you mentioned were previously launched in beta earlier this year. What's new? JP: Yeah, so this is a [General Availability Release], so people can actually build applications on top of it. Now, I understand these APIs, per se, are not new. IBM has actually been working in the space of speech for more than 40 years. Now, the key is that we're trying to expose it as services that other people can use to build applications[...]

There are two things that are remarkable about this. One is that we've actually published a few interesting papers in the last few weeks that showed on some common benchmarks we actually gained very good performance.

My team published a paper just a few weeks ago that showed an error rate of eight percent on a benchmark for phone conversations. One of the use cases we're looking at is not just giving commands to your phone, but it's also about, can I transcribe the speech that happened in a call center when an agent and a person are talking? So, there we've shown very good performance with lots of research happening[...] I think the next [lowest published error rate from a competitor] is at 12.6 percent -- which is a huge improvement.
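The error rates Pesenti cites are word error rates, which are conventionally computed as the word-level edit distance between the recognized transcript and a human reference, divided by the number of reference words. The sketch below is a minimal generic implementation of that metric in Python, not IBM's code:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution ("the" -> "a") over six reference words, roughly 0.167.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

On this metric, dropping from 12.6 to 8 percent means roughly a third fewer transcription mistakes per conversation, which is why a few points matter so much in practice.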

And the second piece is that as we put the services out there, we are building a platform. So you see things coming where it's easy to put the services together. We have common scenarios for adaptation. When you put a service out there, you want people to be able to customize it -- add their own knowledge to it. How do you pronounce a word? How do you spell a word? The interesting thing about the Watson Developer Cloud is that we're creating a platform where we put all of this core functionality that people will be able to adapt in a consistent way.
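The adaptation idea Pesenti describes -- letting customers layer their own pronunciations and spellings on top of a shared base -- can be illustrated with a toy overlay lexicon. This is a hypothetical sketch of the concept, not the Watson Developer Cloud customization API; the names and entries are invented:

```python
# Base vocabulary shared by all users of the (hypothetical) service.
BASE_LEXICON = {"ibm": "IBM", "jeopardy": "Jeopardy"}

def normalize(tokens, custom_lexicon=None):
    """Map raw tokens to preferred spellings.

    Customer-supplied entries take precedence over the base lexicon,
    mirroring how a customized service layers domain knowledge on top
    of a shared model. Unknown tokens pass through unchanged.
    """
    lookup = dict(BASE_LEXICON)
    if custom_lexicon:
        lookup.update(custom_lexicon)     # customer additions win
    return [lookup.get(token.lower(), token) for token in tokens]

print(normalize(["ibm", "watson"], {"watson": "Watson"}))  # ['IBM', 'Watson']
```

The design point is the consistent layering: every service exposes the same "base model plus customer overlay" shape, so adapting one service teaches you how to adapt the others.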

GB: What are the short-term goals and long-term goals of the Watson project? JP: Short-term, we're seeing a tremendous amount of interest from customers, and we are closing business at a fast pace. And we're in the process of basically making this technology ubiquitous. I think we have enough assets today to get pretty much any large company out there to leverage the technology to improve their business processes. So, our goal short-term is to get that adoption, and get people live and productive with it[...]

Our long-term goal is to just expand the platform and improve. We're always working on improving each of the services. I talked to you earlier about speech with an eight percent error rate. Human error rate is around 4.5 percent. Our goal is to get to that level. It sounds like a small difference, but every point that you decrease means fewer errors when you do transcription, or when you talk to your phone. It makes your system much more usable. It completely expands the use cases.

And even with the technology we have, we have a huge market. But every time we make the technology more flexible, more accurate, expand the reach of it, we will expand tremendously the size of the market as well. That's our mid-term and long-term goal.

GB: If someone reading this wants to use Watson technology in an app or website, do they need specific knowledge of machine learning, or will general programming experience suffice? JP: Right now, if you look at the services we put out there, we tried to shrink wrap them so that anybody without machine learning skills or language skills can use them. So you can go [to the Bluemix site], and anybody will be able to look at the demos, look at the APIs, and do something. We were trying to simplify to make it possible for people to do that. And many of the services just kind of work out of the box, and you can start using them[...]

When you start to create a really robust application, if you have a decent understanding of some machine learning concepts[...] it can end up being useful.

By the way, Watson is not a general purpose machine learning platform. You're not just running standard [things like] binary trees. It's not just a blank machine learning platform -- it's really about applying it to tasks. So, you can use it without a [machine learning background] -- definitely. But as you're putting an application together, right now, we still find that it's good to understand what it means.

I can give you an example. We're trying to enable our customers to build question-answering agents[...] And when you develop that, the trick is, you need to develop an application where people can ask questions in any possible way. If the system just knows one way to formulate a question, it's not very interesting. You don't want to tell your user, "You have to ask your question this way." You don't tell them anything. They look at the app, and they ask it.

So when you build an application, you need to kind of figure out how well it's going to perform in the real world[...] And often, when people do this, it's very easy to create a system where you can fake it. You know, you go there, you know what kind of question the system knows well, so you show it to your boss. And you ask these questions, and the response is perfect. And then you ask this other question... And you can find people on the Web who have done these demos.

But at the end, when you put it in front of the real world, the thing will completely fall apart, because it doesn't handle all of the subtleties of the particular language. And to handle that, you have to have this concept of a blind set -- a blind test set. You know, you need to test your system with something you didn't use to design it.

That's a concept that people have a hard time wrapping their head around, because people usually have a functional view of testing. "These are the things I need to test, and I know exactly what the error should be on the output there. And I've designed my system based on my test." But here, you basically need to put your tests aside, and say, "My test is actually real humans using the system[...]" And then I'm going to design my system without thinking about it, and this blind set will give me a representative view of how well it works.
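The blind-set discipline Pesenti describes boils down to one mechanical rule: carve off a held-out slice of real user questions before you start designing, and never look at it until the final evaluation. A minimal sketch of that split (the data and fraction here are illustrative, not IBM's process):

```python
import random

def blind_split(examples, holdout_fraction=0.2, seed=0):
    """Set aside a blind test set BEFORE designing the system.

    The development portion is what you inspect and tune against;
    the blind portion is touched exactly once, for the final,
    representative measurement of real-world performance.
    """
    rng = random.Random(seed)          # fixed seed: the split is reproducible
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]

questions = [f"question {i}" for i in range(100)]
dev_set, blind_set = blind_split(questions)
print(len(dev_set), len(blind_set))  # 80 20
```

The point of the fixed, untouched holdout is exactly the failure mode he describes: a system tuned on the questions you demoed will score perfectly on those questions and fall apart on the phrasings you never saw.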

GB: The way I understand it, the goal is to design Watson to be robust enough that when an end-user interacts with an app or website using Watson technology, the experience is seamless. JP: That's right. And the point is that when you design this robust system, it's easy to make mistakes, unfortunately. You think it's robust, but it's actually not. And so we're trying to give tools to our customers to figure that out, to measure it, to design it. And we're trying to shrink wrap our services so you can do that out of the box. But I will say that it's a pretty challenging thing to do[...] I think [our customers] can go a long way already, but there's still things where we -- my team and myself -- we're still learning a lot, we're developing new tools, we're improving it. And it's something that, as we're developing that platform, it's an interesting piece: Help our customers develop robust applications. It's not easy at all, and it's easy to pretend that it's robust, but when you put it out there, it's not.

Final Jeopardy

If you'd like to learn more about Watson, the best place to start is with the numerous articles available right here at ExtremeTech. And once you've done some reading, you can head over to the Watson Developer Cloud website, and try out some of the public demos. If you're looking to take advantage of the Watson APIs in your own app or website, you can try out Bluemix for free, and start developing today.
