Kosta Derpanis posed this question on Twitter:
Did you know ConvNets were initially patented by AT&T Bell Labs? Source.
Then Yann LeCun, following up on a 2019 podcast, replies in an awkward nine-part Twitter thread about intentionally violating IP restrictions. Since this thread could disappear any minute, and in the spirit of LeCun’s own violation mindset, I’ve posted it here for analysis and archival purposes:
There were two patents on ConvNets: one for ConvNets with strided convolution, and one for ConvNets with separate pooling layers. They were filed in 1989 and 1990 and allowed in 1990 and 1991.
We started working with a development group that built OCR systems from it. Shortly thereafter, AT&T acquired NCR, which was building check imagers/sorters for banks. Images were sent to humans for transcription of the amount. Obviously, they wanted to automate that.
A complete check reading system was eventually built that was reliable enough to be deployed. Commercial deployment in banks started in 1995. The system could read about half the checks (machine printed or handwritten) and sent the other half to human operators.
The first deployment actually took place a year before that in ATM machines for amount verification (first deployed by the Crédit Mutuel de Bretagne in France). Then in 1996, catastrophe strikes: AT&T split itself up into AT&T (services), Lucent (telecom equipment), and NCR.
Our research group stayed with AT&T (with AT&T Labs-Research), the engineering group went with Lucent, and the product group went with NCR. The lawyers, in their infinite wisdom, assigned the ConvNet patents to NCR, since they were selling products based on them.
But no one at NCR had any idea what a ConvNet was! I became a bit depressed: it was essentially forbidden for me to work on my own intellectual production 😭. I was promoted to Dept Head and had to decide what to do next. This was 1996, when the Internet was taking off.
So I stopped working on ML. Neural nets were becoming unpopular anyways. I started a project on image compression for the Web called DjVu with Léon Bottou. And we wrote papers on all the stuff we did in the early 1990s.
It wasn’t until I left AT&T in early 2002 that I restarted work on ConvNets. I was hoping that no one at NCR would realize they owned the patent on what I was doing. No one did. I popped the champagne when the patents expired in 2007! 🍾🥂
Moral of the story: the patent system can be very counterproductive when patents are separated from the people best positioned to build on them.
Patents make sense for certain things, mostly physical things. But almost never make sense for “software”, broadly speaking.
Something sounds very wrong. When AT&T spun out NCR as its computer division in 1996 (and Lucent as its equipment and systems division), were patents on computer technology really separated from the people best positioned to build on them? The product group sounds like exactly the right place for patents on a product. And then popping champagne for not being caught when illegally taking IP from a former employer?