Summary
Robust data infrastructure is vital for AI, enabling massive data collection, storage, and processing.
Data centers and cloud services provide scalable, on-demand computing resources for AI.
Data platforms and analytical tools streamline efficient model training and deployment.
Cybersecurity is crucial for protecting AI data in an increasingly threat-prone digital landscape.
Welcome to the second part of my in-depth exploration of the AI value chain, which covers data infrastructure.
Data is the lifeblood of AI applications, and robust data infrastructure is essential for collecting, storing, processing, and managing the vast amounts of data required for machine learning (ML) and deep learning (DL).
This data infrastructure encompasses:
Data Centers & Cloud Service Providers
Data Platforms and Tools
Foundational Models
Cybersecurity
💾Data Centers & Cloud Service Providers
As the amount of data required for training increases, companies need a place to store and process it. The most cost-effective way is through cloud computing, which provides scalable, on-demand access to computing resources via the internet.
It eliminates the need for significant upfront capital expenditure on hardware such as servers. Companies that require more computing resources can simply subscribe to cloud service providers (CSPs).
According to Synergy Research Group, below are the top cloud providers worldwide:

CSPs are constantly looking to purchase more computing resources to meet their customers’ growing demand.
Goldman Sachs forecasts that revenue from cloud computing will hit $2 trillion by 2030. The total addressable market of cloud computing is expected to grow at a CAGR of 22% from 2024 to 2030. Approximately 10% to 15% of the revenue comes from generative AI.
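To get a feel for what these numbers imply, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions of mine that the forecast does not state: that the 22% CAGR applies to the same revenue series that reaches $2 trillion in 2030, and that the 10% to 15% generative AI share applies to that 2030 figure. The function and variable names are purely illustrative.

```python
# Back-of-the-envelope check of the forecast figures quoted above.
# Assumption (mine): the 22% CAGR applies to the same revenue series
# that reaches $2 trillion in 2030, with 2024 as the base year.

def implied_base(target: float, cagr: float, years: int) -> float:
    """Starting value that compounds to `target` after `years` at `cagr`."""
    return target / (1 + cagr) ** years

TARGET_2030_BN = 2_000      # $2 trillion, expressed in billions of dollars
CAGR = 0.22                 # 22% compound annual growth rate
YEARS = 2030 - 2024         # six compounding periods

base_2024_bn = implied_base(TARGET_2030_BN, CAGR, YEARS)

# Assumption (mine): the 10% to 15% generative AI share applies to 2030 revenue.
gen_ai_low_bn = 0.10 * TARGET_2030_BN
gen_ai_high_bn = 0.15 * TARGET_2030_BN

print(f"Implied 2024 base: ${base_2024_bn:,.0f}bn")
print(f"Generative AI share: ${gen_ai_low_bn:,.0f}bn to ${gen_ai_high_bn:,.0f}bn")
```

Under these assumptions, the implied 2024 starting point is roughly $600 billion, and the generative AI slice of the 2030 figure would be about $200 billion to $300 billion. Treat this as a sanity check on the arithmetic, not as a forecast of its own.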
The expected increase in cloud computing demand also drives demand for data centers. These are facilities that house the critical infrastructure required to store, process and manage the vast amounts of data generated worldwide.
It has the following main components:
Beyond the tech sector, this AI-driven transformation is creating a significant spillover effect across industries such as power, real estate development, and property management, all benefiting from increased demand for data centers.
Investment Outlook
McKinsey projects that global demand for data centers will increase at an annual rate of between 19% and 22% from 2023 to 2029. While demand is increasing, one uncertainty remains.
As mentioned in my previous post, the Trump administration is replacing the Biden-era U.S. AI Diffusion Policy with a new one. It remains to be seen what this new policy entails and its potential impact on overall market growth.
Any restriction on chip exports could slow down the future development of data centers. At the moment, there are no issues with existing orders since this AI Diffusion Policy has been rescinded.
Regardless, I think the impact would be minimal for supporting components such as electricity and network connectivity, as these are largely subscription-based and tied to existing data centers.
Overall, companies within this data center and cloud services segment benefit from highly recurring revenue models, making them well-suited for long-term investment. However, one should remain cautious of ongoing geopolitical uncertainties that may impact future growth.
📊Data Platforms and Tools
These are used to manage and facilitate the effective use of data. Companies such as Snowflake, Databricks, and Palantir provide platforms and tools for processing and analyzing data, which serves as the foundation for training and deploying AI models.
These platforms and tools have the following benefits beyond training AI models:
Help companies consolidate various systems - from point-of-sale (POS) solutions to backend ERP systems - into a single, integrated environment for streamlined data analytics.
Businesses can leverage data to drive strategic decisions. Take TikTok, for example - it harnesses data to recommend videos and advertisements, keeping users engaged on its platform for hours.
Investment Outlook
In my opinion, this segment will have companies with highly recurring and "sticky" business models. Their products often create high switching costs, making it difficult for users to change providers.
Migrating large volumes of data from one platform to another is both time-consuming and costly. Additionally, adopting a new platform comes with a learning curve, further discouraging users from making a change.
The global data platforms and tools market is projected to grow at a compound annual growth rate (CAGR) of approximately 12.5% from 2024 to 2032.
🧠Foundational Models
These are pre-trained AI models that form the basis for a wide range of AI applications across diverse industries. Rather than developing AI from scratch, one can use a foundational model as a starting point to develop AI that powers new applications more quickly and cost-effectively.
Companies like OpenAI, Google, Microsoft and Amazon are at the forefront of developing these foundational models, which serve as the building blocks for tasks such as natural language processing, computer vision and predictive analytics.
To my knowledge, not many companies develop these foundational models, so I will not look into this segment further.
🔐Cybersecurity
As AI technologies advance and become more integrated into various sectors, the need for robust cybersecurity measures to protect sensitive or proprietary data and systems becomes increasingly critical.
This is especially true when AI models are trained on vast amounts of data. Cybersecurity plays a crucial role in protecting this data from tampering.
There are 7 core segments in cybersecurity:
🪪Identity & access management
💻Endpoint security
🛜Network security
📱Application security
🗂️Data security
☁️Cloud security
🛡️Security operations
Each of these segments is interconnected. For example, suppose a user wants to access the company’s resources. He uses a company laptop and types in his password to log in (🪪identity & access management).
Then, the laptop connects to the internet via a Virtual Private Network (VPN), which encrypts the internet connection and hides the IP address. This makes data downloads less susceptible to interception by attackers (💻endpoint security & 🛜network security). I think you get the picture.
I like how Public Comps, a website that organizes and visualizes SaaS metrics, lays out an overview of the cybersecurity value chain. Credit goes to the writer, Eric Flaningam:

Companies like Palo Alto Networks, CrowdStrike, and Fortinet are leading the charge in developing AI-driven cybersecurity solutions that proactively detect and mitigate threats in real time.
Investment Outlook
As with all the segments above, companies in the cybersecurity space often have highly recurring and “sticky” business models. Customers do not often change their cybersecurity providers.
Instead, companies often engage with multiple service providers to diversify their cybersecurity defenses. This approach minimizes the risk of disruption in the event of a cyberattack, ensuring greater resilience against potential threats.
The global cybersecurity market is projected to grow at a compound annual growth rate (CAGR) of approximately 14.3% from 2024 to 2032. This growth is driven by the increasing number of cyber-attacks, the proliferation of e-commerce platforms, the emergence of smart devices, and the deployment of cloud technologies.
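To put this growth rate side by side with the 12.5% CAGR quoted earlier for data platforms and tools, here is a small Python sketch that converts each rate into a cumulative growth multiple. It assumes 2024 is the base year and 2032 the end year (eight compounding periods); the helper name and the dictionary are mine, for illustration only.

```python
# Convert the CAGR figures quoted in this post into cumulative growth multiples.
# Assumption (mine): 2024 is the base year, so 2032 is eight periods later.

def growth_multiple(cagr: float, years: int) -> float:
    """Factor by which a market grows after compounding `cagr` for `years`."""
    return (1 + cagr) ** years

YEARS = 2032 - 2024

# CAGR figures as quoted in this post
segments = {
    "Data platforms and tools": 0.125,
    "Cybersecurity": 0.143,
}

multiples = {name: growth_multiple(rate, YEARS) for name, rate in segments.items()}

for name, multiple in multiples.items():
    print(f"{name}: about {multiple:.1f}x its 2024 size by 2032")
```

Under that assumption, the data platforms and tools market would end up at roughly 2.6 times its 2024 size, and the cybersecurity market at roughly 2.9 times. A gap of less than two percentage points in annual growth compounds into a noticeable difference over eight years.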
Disclaimer:
The information provided in this blog post is for informational purposes only and should NOT be construed as financial advice. Investing in stocks and ETFs involves risk, and there is no guarantee of profits. Past performance is not indicative of future results. It is important to conduct thorough research or consult with a qualified financial advisor before making any investment decisions. The author is NOT a financial advisor and is sharing his personal experiences and opinions only.
Additionally, please note that the author holds a position in the discussed stock, and his view may be biased as a result.