NFT Basics: Collecting Web Data for Your GAN Project

June 21, 2022 | NFT

When most people think of non-fungible tokens (NFTs), they think of artwork, which has become a popular use case for the technology. More broadly, an NFT is a blockchain-based record that allows individuals or organizations to claim ownership of a digital item, including music, code, and art.

Some people may be familiar with the Bored Ape Yacht Club, a project that produced 10,000 pieces of unique digital art, and others with CryptoPunks, a popular NFT art collection by Larva Labs.

(CryptoPunks sample Licensed courtesy of Shutterstock)

Bored Apes may appear to be nothing more than a joke, but in real economic terms, the entire collection is worth more than $1 billion — with some individual apes selling for more than $2 million.

The narrative does not stop there, however. A group of individuals exposed to the Bored Ape project decided to apply Generative Adversarial Networks (GANs) technology to make their own spin on it.

A GAN is a machine learning model that learns from existing examples, without human-labeled data, and generates new output that resembles them. The GAN Bored Ape series was created this way.

Source: @boredapebot

The role of data in creating a GAN

A GAN consists of a “Generator” and a “Discriminator.” The Generator produces new data, such as pictures or text, modeled on the training data supplied to the algorithm. The Discriminator is responsible for telling the newly generated samples apart from the real-world data. The Generator only succeeds when its output is close enough to the real data that the Discriminator accepts it as real. This means that when you build or train a GAN to generate high-value output, it is essential to start with accurate, high-quality data.
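The tug-of-war between the two networks can be sketched with the loss each side tries to minimize. This is a minimal illustration in plain Python, assuming the Discriminator outputs a probability between 0 and 1 that a sample is real; the function names and the sample scores are made up for the example, and the generator loss shown is the commonly used non-saturating form rather than anything specific to the Bored Ape project.

```python
import math

def binary_cross_entropy(prediction, target):
    """Loss for one Discriminator score in (0, 1) against a 0/1 label."""
    eps = 1e-12  # keeps log() away from zero
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def discriminator_loss(real_scores, fake_scores):
    """The Discriminator wants real samples scored 1 and generated ones 0."""
    real = sum(binary_cross_entropy(s, 1) for s in real_scores) / len(real_scores)
    fake = sum(binary_cross_entropy(s, 0) for s in fake_scores) / len(fake_scores)
    return real + fake

def generator_loss(fake_scores):
    """The Generator wants its samples to be scored as real (label 1)."""
    return sum(binary_cross_entropy(s, 1) for s in fake_scores) / len(fake_scores)

# Hypothetical scores: probability that each sample is real.
confident = discriminator_loss(real_scores=[0.9, 0.8], fake_scores=[0.1, 0.2])
fooled = discriminator_loss(real_scores=[0.5, 0.5], fake_scores=[0.5, 0.5])
print(round(confident, 3), round(fooled, 3))
```

A Discriminator that separates real from generated samples well has a low loss. As the Generator improves, the scores drift toward 0.5 and the Discriminator's loss rises, which is the point where generated output has become hard to distinguish from the training data.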

How to collect web data for your GAN project

If you’re looking to get started with GANs, web data is a great resource as it is scalable. There are many ways to collect web data, but we’ll focus on two particular methods: web data scraping tools and web APIs.

Web Data Scraping Tools

Several kinds of web data scraping tools are available. Some, like browser scraping add-ons, are simple to use and suitable for small-scale data collection, while others, such as the Python libraries Beautiful Soup and pandas, require significant technical knowledge and programming skills.
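To give a feel for the programmatic route, here is a small sketch that pulls image URLs out of a page. It uses Python's built-in html.parser module as a stand-in for Beautiful Soup so that it runs without extra installs; the class name, the sample HTML, and the example.com addresses are all invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImageLinkParser(HTMLParser):
    """Collects absolute URLs from the src attribute of <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # ignore <img> tags that have no source
            self.image_urls.append(urljoin(self.base_url, src))

def extract_image_urls(html, base_url):
    """Returns every image URL found in the given HTML string."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.image_urls

# Made-up page fragment standing in for a downloaded gallery page.
SAMPLE_HTML = """
<div class="gallery">
  <img src="/art/ape-001.png" alt="ape 1">
  <img src="https://cdn.example.com/art/ape-002.png" alt="ape 2">
  <img alt="missing source">
</div>
"""

urls = extract_image_urls(SAMPLE_HTML, "https://example.com/collection")
print(urls)
```

Relative paths are resolved against the page address, so the resulting list can be handed directly to a downloader.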

Some websites have complex structures that need a sophisticated script to interpret them. Others make heavy use of JavaScript, which makes data collection difficult unless you use an automated browser such as Selenium or Puppeteer. Many websites, such as eCommerce sites and social media platforms, are well protected against scraping, so collecting from them requires more advanced solutions that rotate proxies and user agents.
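The rotation idea can be sketched in a few lines. The example below only prepares requests and pairs each one with a proxy; it does not send anything. The user agent strings, proxy addresses, and the build_requests helper are placeholders for this example, not values from any real service.

```python
from itertools import cycle
from urllib.request import Request

# Placeholder values; a real project would supply its own pools.
USER_AGENTS = [
    "ExampleBrowser/1.0 (Windows)",
    "ExampleBrowser/1.0 (Macintosh)",
]
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
]

def build_requests(urls, user_agents=USER_AGENTS, proxies=PROXIES):
    """Pairs each URL with the next user agent and proxy in round-robin order."""
    ua_pool = cycle(user_agents)
    proxy_pool = cycle(proxies)
    plans = []
    for url in urls:
        request = Request(url, headers={"User-Agent": next(ua_pool)})
        plans.append((request, next(proxy_pool)))
    return plans

plans = build_requests([
    "https://example.com/page/1",
    "https://example.com/page/2",
    "https://example.com/page/3",
])
for request, proxy in plans:
    # urllib stores header names capitalized as "User-agent".
    print(request.full_url, request.get_header("User-agent"), proxy)
```

To actually send a prepared request through its proxy with the standard library, one option is an opener built with urllib.request.build_opener and urllib.request.ProxyHandler. Commercial scraping platforms typically handle this rotation for you.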

Some of the more popular web scraping platforms you can subscribe to are:

  1. Data Collector
  2. ScrapingBee
  3. ParseHub
  4. PhantomBuster
  5. Apify
(Data Collector IDE — Image courtesy of Bright Data)

Whichever solution you choose, whether hiring a web scraping freelancer or subscribing to a SaaS web data collection platform, make sure to request a POC (proof of concept) and a live demo to ensure you can successfully collect the public web data from your target resources.

If you are worried about the legality of web scraping, there is some reassurance: in September 2019, a US federal appeals court held in hiQ Labs v. LinkedIn that collecting publicly available data from the web likely does not violate the federal Computer Fraud and Abuse Act.

Once you’ve collected the data, you can use it to train your GAN.
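Because a GAN is only as good as the data it learns from, it usually pays to clean the collected files before training. The sketch below is one simple way to do that, assuming the downloads are held in memory as a mapping from filename to raw bytes; the deduplicate helper and the sample filenames are hypothetical.

```python
import hashlib

def deduplicate(files):
    """Keeps the first copy of each distinct payload.

    `files` maps a filename to the raw bytes that were downloaded.
    """
    seen = set()
    unique = {}
    for name, payload in files.items():
        if not payload:
            continue  # skip empty downloads
        digest = hashlib.sha256(payload).hexdigest()
        if digest in seen:
            continue  # identical content was already kept
        seen.add(digest)
        unique[name] = payload
    return unique

# Made-up downloads: two files share the same bytes and one is empty.
downloads = {
    "ape-001.png": b"image-bytes-A",
    "ape-001-copy.png": b"image-bytes-A",
    "ape-002.png": b"image-bytes-B",
    "broken.png": b"",
}

clean = deduplicate(downloads)
print(sorted(clean))
```

Hashing catches exact duplicates only. Near-duplicates, such as the same image at a different resolution, would need a separate check.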

Web APIs

Web APIs, on the other hand, provide programmatic access to web-based data. Many sites offer APIs that let you request data in a specific format, which makes it easy to use in your GAN training. One popular example is the IMDb API, which allows you to collect data on IMDb movie and TV titles for your project. Keep in mind, however, that many public APIs cap request rates or the amount of data they return, so if you want to speed up the process you may prefer the web scraping path.
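A typical API collection loop pages through results and pauses between calls to stay within rate limits. The sketch below assumes a hypothetical JSON API whose responses carry an "items" list and a "next" URL; real APIs, including IMDb's, define their own response shapes and authentication, so treat the field names and addresses as placeholders. A fake fetcher stands in for the network so the example runs offline.

```python
import json
import time
from urllib.request import urlopen

def fetch_json(url):
    """Downloads one URL and parses the response body as JSON."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)

def collect_items(first_url, fetch=fetch_json, max_pages=5, delay_seconds=1.0):
    """Follows "next" links, gathering "items" from up to max_pages pages."""
    items = []
    url = first_url
    for _ in range(max_pages):
        if url is None:
            break
        page = fetch(url)
        items.extend(page.get("items", []))
        url = page.get("next")
        if url:
            time.sleep(delay_seconds)  # pause between calls to respect rate limits
    return items

# Fake responses standing in for a real API (hypothetical shape).
FAKE_PAGES = {
    "https://api.example.com/titles?page=1": {
        "items": ["title-1", "title-2"],
        "next": "https://api.example.com/titles?page=2",
    },
    "https://api.example.com/titles?page=2": {
        "items": ["title-3"],
        "next": None,
    },
}

titles = collect_items(
    "https://api.example.com/titles?page=1",
    fetch=FAKE_PAGES.__getitem__,
    delay_seconds=0,
)
print(titles)
```

Passing the fetch function in as a parameter keeps the paging logic testable without network access. In real use, the default fetch_json would be called instead of the fake.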

What other benefits do GANs hold for businesses?

It would be a mistake to assume that GANs are an isolated development, relevant only to those involved in the blockchain, crypto, and NFT art trading markets.

GANs, a machine learning technique within artificial intelligence, have the potential to change how businesses generate ideas, from whole business concepts down to products and lines of code.

Early cases already hint at capabilities that, when properly implemented, could have a wide impact on entire industries such as music, film, and graphics.

Since their debut, GANs have evolved significantly. When they were first developed, they could produce little more than handwritten digits and small, low-resolution images that the human eye could recognize.

Source: @towardsdatascience

Since then, GAN machine learning methods have advanced significantly, although the technology has yet to be perfected. Companies that choose to become early adopters can give their DevOps teams the job of ingesting open-source web data and building unique GAN models for their respective industries. Players like these have a chance to become industry leaders, not only by coming up with concepts that no human mind has imagined but also by going against the grain and using cutting-edge technology.
