Big Data is an umbrella term that covers many related concepts which, despite being important, are not part of most people's everyday vocabulary.
With that in mind, we created this list of 45 terms related to Big Data so you can expand your knowledge and consult it whenever you need.
Did you miss a word? Get in touch! The list below is constantly expanding and we would love to have your help in making it an even more robust and useful source of information.
Big Data : the name given to the large volumes of data available on the internet, which are increasing every second. IDC estimates that by 2025, the volume of data worldwide will reach 175 zettabytes. To put this size into perspective, if you tried to download a 175 zettabyte file, it would take about 1.8 billion years.
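The download-time figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes a constant connection speed of 25 megabits per second, which is our assumption and is not stated in the definition above, and uses decimal units (1 zettabyte = 10^21 bytes).

```python
# Rough check of the "1.8 billion years" figure.
# ASSUMPTION: a constant download speed of 25 Mbit/s (not stated in the text).
ZETTABYTE_BYTES = 10**21
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def download_years(size_zettabytes: float, speed_mbit_per_s: float) -> float:
    """Return the years needed to download a file of the given size."""
    total_bits = size_zettabytes * ZETTABYTE_BYTES * 8
    seconds = total_bits / (speed_mbit_per_s * 10**6)
    return seconds / SECONDS_PER_YEAR


years = download_years(175, 25)
print(f"{years / 1e9:.2f} billion years")
```

Under that assumed speed the result lands at roughly 1.8 billion years; a faster or slower connection changes the number proportionally.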
Data Mining : is the process of exploring large volumes of data in search of patterns and valuable information.
Machine Learning : these are computational algorithms capable of improving automatically through experience and the use of data.
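To make "improving through data" concrete, here is a minimal sketch of a model that learns a single number from examples using gradient descent. The data points, learning rate, and step count are illustrative assumptions, and real machine learning systems are far more elaborate.

```python
# Toy "learning from data": fit y = w * x by gradient descent.
# ASSUMPTION: the data, learning rate and step count are illustrative only.

def mean_squared_error(w, samples):
    """Average squared gap between the prediction w * x and the target y."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def train(samples, steps=200, learning_rate=0.01):
    w = 0.0  # start with no knowledge of the relationship
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= learning_rate * grad
    return w


samples = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]  # roughly y = 2x
w = train(samples)
print(f"learned w = {w:.2f}")
print(f"error before training: {mean_squared_error(0.0, samples):.2f}")
print(f"error after training:  {mean_squared_error(w, samples):.4f}")
```

The error after training is much smaller than before, which is the sense in which the algorithm "improves" by using the data.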
Web crawler : These are digital crawling robots that perform the function of scanning websites or digital databases faster than any human being. They are capable of delivering and updating information with a high level of accuracy in real time.
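The crawling idea can be sketched in a few lines of Python. To keep the example runnable offline, a small in-memory "website" stands in for real HTTP requests; every URL and page in it is hypothetical, and a production crawler would also need politeness rules, error handling, and URL normalization.

```python
from collections import deque
from html.parser import HTMLParser

# ASSUMPTION: a tiny in-memory "website" stands in for real HTTP requests,
# so the example runs offline. All URLs and pages are hypothetical.
FAKE_SITE = {
    "/": '<a href="/products">Products</a> <a href="/about">About</a>',
    "/products": '<a href="/products/1">Item 1</a> <a href="/">Home</a>',
    "/products/1": "<p>Item 1 details</p>",
    "/about": '<a href="/">Home</a>',
}


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, fetch):
    """Breadth-first crawl; returns pages in the order they were visited."""
    visited, queue, seen = [], deque([start]), {start}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page not found: skip it
            continue
        visited.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


print(crawl("/", FAKE_SITE.get))
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other.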
Web scraper: digital robots capable of collecting more specific data than web crawlers. Similar to web crawlers, they also capture data at high speed and with high levels of accuracy, making the automation of data collection complete and relevant to the purposes of companies.
Price Scraping: is the extraction of data on product prices from e-commerce websites. It can be done in real time using web scrapers.
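As a rough illustration of price scraping, the sketch below pulls product names and prices out of a small HTML snippet using only Python's standard library. The snippet and the "name" and "price" class names are invented for the example; real stores use their own markup, so the selectors would need to be adapted.

```python
from html.parser import HTMLParser

# ASSUMPTION: the HTML snippet and the "name"/"price" class names are made up;
# real shops use their own markup.
SAMPLE_HTML = """
<div class="product"><span class="name">Keyboard</span><span class="price">$49.90</span></div>
<div class="product"><span class="name">Mouse</span><span class="price">$19.50</span></div>
"""


class PriceScraper(HTMLParser):
    """Pairs each product name with the price that follows it."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # "name", "price", or None
        self.current_name = None
        self.prices = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self.current_field = css_class

    def handle_data(self, data):
        if self.current_field == "name":
            self.current_name = data.strip()
        elif self.current_field == "price" and self.current_name:
            # Drop the currency symbol and store the price as a number
            self.prices[self.current_name] = float(data.strip().lstrip("$"))

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


scraper = PriceScraper()
scraper.feed(SAMPLE_HTML)
print(scraper.prices)
```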
Captcha : A type of challenge-response test that is mainly used as an anti-spam tool. You know those tests that appear on some websites asking you to identify vehicles, traffic lights or pedestrian crossings? This is an example of a Captcha.
Proxy : refers to proxy servers, which are used in automated data collection to prevent bots from being blocked when requesting information from the websites from which the data will be collected.
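One common way to use proxies in data collection is to rotate through a pool of them, so that consecutive requests leave from different addresses. The sketch below shows the idea with Python's standard library; the proxy addresses are placeholders from a documentation-only IP range, not working servers, and the helper name is our own.

```python
import itertools
import urllib.request

# ASSUMPTION: these proxy addresses are placeholders (documentation IP range),
# not working servers.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]
proxy_pool = itertools.cycle(PROXIES)  # endless round-robin over the list


def opener_for_next_proxy():
    """Return (proxy, opener) where the opener routes traffic via that proxy."""
    proxy = next(proxy_pool)
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return proxy, urllib.request.build_opener(handler)


# Each request would go out through a different proxy in turn.
for _ in range(4):
    proxy, opener = opener_for_next_proxy()
    print("next request via", proxy)
    # opener.open("https://example.com")  # not executed: needs a live proxy
```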
Artificial Intelligence (AI) : These are systems capable of imitating human intelligence in performing tasks. AI has a variety of practical uses in everyday life, such as in search engines, online advertising, content recommendation systems, virtual assistants, facial recognition, spam filtering, and autonomous vehicles.
Data Engineering: is the activity that involves, among other things, the collection, translation and validation of data for later analysis.
Data Science : a set of strategies, tools and processes used to obtain accurate, high-quality insights from Big Data. With it, companies can identify opportunities more quickly and discover talent, in addition to acquiring and retaining more customers, among other advantages.
Data Analytics : is the process of analyzing data in search of useful information for the organization’s objectives. It is an activity that needs to take into account details such as metadata, dependencies between data and relationships between data and the real world.
Data Driven: refers to organizations that routinely rely on data in their decision-making process. Companies that make data-driven decisions tend to generate more revenue, serve customer needs better, and become more profitable.
Python : is a high-level programming language released in 1991. It is widely used in the development of web crawlers and also in the creation of AI applications.