Reflection: HOW I think about AI

AI seems so attractive that big-name companies pour money and resources into it. It seems that whoever leads AI research could gain the power to control the world. Personally, however, I hold a rather pessimistic view of AI, including ML, DL, RL, and their applications.

Is AI a New Thing?

NO! AI is an old thing. It was first developed in the 1950s and reached its first peak in the 1990s. After that, AI research faded because of computational limitations: to develop AI and ML models further, there was not enough memory, and computing power was much weaker than it is today.

The second boom of AI started with the boom of Big Data, when large memories and disks became available. Companies like Google had collected terabytes of user data and were eager to mine it to create more business value for their products. At the same time, distributed computing developed quickly, and frameworks like Hadoop and Spark were created to make full use of that data.

Then came GPUs and Deep Learning. Originally, the GPU was a parallel-computing device for graphics. It can compute and draw the pixels on your screen in parallel, which is why graphically demanding games need a good GPU. Around 2011, creative researchers tried using GPUs to train neural networks, with great success: training became much faster than before. Soon, researchers everywhere adopted GPUs and even TPUs (processing units specialized for tensor computation) for Deep Learning research, and records were broken every day. The success of Deep Learning not only revived AI but also made Nvidia great.

The Reality of AI in 2021

Now, most CS researchers work on AI; however, no major breakthrough comparable to training DL models on GPUs has appeared since. That is to say, starting AI research in 2021 is extremely competitive (“involuted”, as Chinese internet slang puts it) and also highly risky.

AI in Research Area

Personally speaking, the number of AI researchers is badly out of balance with the field’s research potential. If you started AI research around 2013, you could become a professor without too much competition because you got your ticket at an early stage. If you start now, however, you may well fail to find a Ph.D. position when you finish your undergraduate degree.

AI is experiencing inflation. Most AI conferences accept thousands of papers each year, with acceptance rates as high as 20%, which suggests that new knowledge is being produced rapidly. However, that does not mean all of those papers are valuable, and some of them are worthless. Can you even read and digest 100 of them each year? Everyone tries to profit from the competition, which is why conferences keep growing while their quality may not improve.

Another terrible story is that, to get a Ph.D. ticket in AI, you need lots of first-author papers at top conferences. The fundamentals of AI require only basic Calculus, Linear Algebra, and Probability, so any first-year student can quickly enter the area. Moreover, an AI research project can take as little as 3 months, which makes it feasible to publish several papers each year. So if you did not start AI research in your first year, how likely do you think you are to beat your competitors?

Although AI attracts more funding nowadays, the increase has not kept pace with the growth in the number of AI researchers. As more and more people flood into AI research, the average funding per researcher will decrease, meaning that over time, the more you work, the less you may gain.

So what would happen if there are no more breakthroughs in AI? Most papers will become rubbish, and most ordinary researchers will lose their jobs. Only the extraordinary ones will stay in the area, and their number should settle at the level typical of other research areas.

AI in Business

AI’s business model is really confusing. The reality is that most AI companies cannot make a profit now, especially AI software companies. I will ignore the hardware ones here because they are beyond the scope of my knowledge.

How can you make AI profitable? Today, most major AI companies publish valuable papers at top conferences every year and attract lots of investment for their research. One critical problem, however, is how to make algorithms profitable. We all know that Internet companies sell services and we enjoy them. Is an algorithm a sellable product?

I don’t think so. If you want a high reputation in the research community, you should open-source your code so that everyone else can verify its correctness. With open-source code, even undergraduate students can build a management system around it and sell the algorithm. Moreover, since the marginal cost of an algorithm is almost zero, no one will pay more just because you improved its accuracy by 0.1%. Businesspeople prefer the cheapest solution within an acceptable error rate.

Consequently, companies that sell CV (mostly face recognition) algorithms do not make any profit. If they go B2B, their customers can easily build a CV system themselves with their own engineers and code from GitHub. If they go B2C, what consumer would pay for such algorithms?

The key to being successful in China is to be the monopoly! Being the monopoly forces everyone to use your products, even if they have no technical barrier. Unfortunately, AI software can never produce monopoly companies, especially when so many people are flooding in. Frankly speaking, with even a small amount of funding, a new Ph.D. graduate can launch an AI startup, and the Internet monopolies can easily copy your ideas and beat you without effort.

How about those big names? Can you find a good job at one of them? To answer this question, we have to understand why big-name companies pay for AI: to improve their products’ profits. For example, if your face recognition increases transactions by 10%, your boss will be happy to fund your project. However, if it only raises accuracy from 80% to 81%, your boss may think twice or move you to a software engineering department that creates value directly.

Therefore, I believe that AI (software) is not suitable for startups. Open source, the key to AI research, also kills AI’s business model. Our society is still waiting for a business genius to make AI software profitable, and I think that person may not have a solid tech background.

Algorithm and Data, Which is More Critical?

Algorithm and data, which do you think is more critical in an AI model? My answer is data.

First, with low-quality data, it is practically impossible to build a precise model. For example, if 50% of the original data is wrong, how can you build an 80% accurate model? On the other hand, if you improve the correctness of the data from 50% to 90%, you may find it easy to train a model with 80% accuracy. Therefore, the quality of the data determines the upper bound of your AI model.
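To put rough numbers on this, here is a minimal Python sketch. It assumes a binary task in which each label is flipped independently with probability `noise_rate`, and that the flips are independent of the model’s own mistakes; the function name `observed_accuracy` and this noise model are assumptions made for illustration only.

```python
def observed_accuracy(true_accuracy: float, noise_rate: float) -> float:
    """Accuracy you would measure against noisy binary labels.

    Assumption: each label is flipped independently with probability
    noise_rate, independently of whether the model is right or wrong.
    A prediction agrees with the noisy label when the model is right and
    the label is intact, or when the model is wrong and the label is flipped.
    """
    return true_accuracy * (1.0 - noise_rate) + (1.0 - true_accuracy) * noise_rate


if __name__ == "__main__":
    # With 50% wrong labels, even a perfect model measures as a coin flip.
    print(observed_accuracy(1.0, 0.5))
    # With 10% wrong labels, a model that is right 90% of the time
    # measures at about 0.82.
    print(round(observed_accuracy(0.9, 0.1), 2))
```

Under these assumptions, at a 50% noise rate every model scores 0.5 against the labels, so a good model cannot even be told apart from a bad one.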

Second, if we improve AI models by feeding them more data, performance scales roughly with the cost of collecting data. If you pay twice as much, you can generally get twice the amount of data at the same quality, and more data usually improves your model. By contrast, given one more month to tune your model’s hyperparameters, you may still fail to improve it by even 0.1%. Since the cost of adding more data is more manageable, I believe most rational businesspeople would prefer it.

Where Does the Data Come From?

This is another critical question, and the sad story is that collecting users’ data is becoming harder and more expensive. Most applications collect user data in the background, and that data is used for their business models or even sold to other companies. When you use a free app, it seems that you enjoy its services at no cost, but the reality is that the company is enjoying your data for free. The more you use it, the more data it collects.

Now, some people have become aware of the big names’ misbehavior and are asking for stricter regulation of data collection. This is good news for users, but it annoys the companies. They can no longer get your data as easily as before, and the price of collecting user data will surely rise in the future. Maybe one day, companies will need to pay you for your private data.

Consequently, AI researchers will have to spend more to collect data. They may find some innovative tasks difficult because of the lack of data. Eventually, the barrier of data collection will hold back the development of AI.

AI+ Makes a New Area

One thing I believe is still promising is AI+, that is, applying AI in various other fields, where we use AI as the algorithm for solving optimization problems. However, this research direction requires a solid background in both areas. One clever example is using a (linear) regression tree to replace the traditional B+ Tree as a database index. Such a model can perform better than the traditional one. This direction should also be less competitive because the bar for systems research is much higher than for AI.
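The index idea can be sketched in a few lines of Python. This is a deliberately simplified toy, not the design from any particular paper or database: it fits a single linear model (rather than a tree of models) that maps a key to its approximate position in a sorted array, records the worst prediction error, and then runs a binary search only inside that error window. The class name `LearnedIndex`, the assumption of distinct numeric keys, and the non-empty input are all illustrative assumptions.

```python
import bisect


class LearnedIndex:
    """Toy learned index: one linear model predicts a key's position."""

    def __init__(self, keys):
        # Assumption: keys is a non-empty collection of distinct numbers.
        self.keys = sorted(keys)
        n = len(self.keys)
        # Ordinary least-squares fit of position ~ slope * key + intercept.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var_key = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos)
                  for i, k in enumerate(self.keys))
        self.slope = cov / var_key if var_key else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Worst-case prediction error over the stored keys; it bounds
        # how far the search window has to extend around a prediction.
        self.max_err = max(abs(self._predict(k) - i)
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted array, or None."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the search window to valid positions.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None


if __name__ == "__main__":
    index = LearnedIndex([3, 8, 15, 21, 42, 57, 90])
    print(index.lookup(42))  # position of 42 in the sorted keys
    print(index.lookup(50))  # None, because 50 is not stored
```

The better the model fits the key distribution, the smaller `max_err` becomes and the less searching each lookup needs. A B+ Tree, by contrast, always walks from the root to a leaf regardless of how regular the keys are.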

In conclusion, AI research aimed at improving precision by 0.1% is very dull and competitive; if you are fascinated by AI, you should have another major field and adopt AI as a tool.

Read more about my undergraduate reflections: Reflections.
