Using a Dataset and AI for Affiliate Content

In this post, I’ll be discussing how you can create a large-scale website in a short amount of time and at a low cost. The key lies in utilizing a comprehensive dataset and leveraging AI-powered generation tools to produce factually accurate articles on a massive scale.

While my perspective primarily revolves around affiliate sites, this approach can be applied to various websites, whether you’re focused on providing information, displaying ads, or even promoting your products or services.

So, let’s dive right in and explore how you can harness the power of OpenAI’s API to generate content grounded in real and reliable information.

How some people use AI writing

It’s interesting to see how many people approach content generation by simply providing a keyword or an outline and expecting the AI to generate something meaningful. While this method isn’t necessarily wrong, it can lead to disastrous outcomes. I’ve tried this approach myself, so let me share an example from a site I’m public about: my energy drink review site.

In the past, I used a bulk publishing spreadsheet that relied on 100% AI-generated content. Unfortunately, it resulted in inaccuracies and inconsistencies.

For instance, the AI would generate reviews with flavors that the energy drinks didn’t actually have or provide conflicting information about the caffeine content. One paragraph might state 160 milligrams of caffeine, but three paragraphs later, it would contradict itself with a claim of only 40 milligrams.

Reflecting on this experience, I realized that relying solely on AI-generated content isn’t the most effective way to create reliable and accurate information (in most cases).

For example, questions like “Can dogs eat almonds?” should be handled cautiously since incorrect information could have serious consequences for our furry friends. On the other hand, topics like “How can you make money from digital art online?” are more subjective and don’t have a definitive right or wrong answer.

In such cases, AI-generated content can offer various perspectives and ideas, leaving it up to the reader to form their opinions and make decisions.

Opinion pieces are another area where AI-generated content can be valuable. Topics like “Why I think everybody should have a side hustle” rely on personal viewpoints and individual experiences, making it impossible to label them as right or wrong.

Similarly, entertainment-focused articles like “15 reasons why your ex hates your guts” exist to amuse and engage the reader. Such an article fulfills its purpose as long as it delivers the intended entertainment value and makes people laugh.

There is a better way

And that is by feeding reliable and accurate data into the AI generation tool of your choice. By providing high-quality and verified information as input, you increase the likelihood of obtaining outputs that are factually correct. This approach requires a strong emphasis on data accuracy and integrity, as the quality of the input directly influences the quality of the generated content.
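To make the idea concrete, here is a rough sketch of what "feeding data in" can look like. It is not the Bulk Publishing Spreadsheet's actual logic; the function name `build_prompt`, the field names, and the example values are all made up for illustration. The point is simply that the verified facts travel inside the prompt, and the model is told to stick to them.

```python
def build_prompt(product: dict) -> str:
    """Build a generation prompt that embeds verified facts about one product."""
    # Turn each field of the row into a bullet the model can quote from.
    facts = "\n".join(f"- {field}: {value}" for field, value in product.items())
    return (
        "Write a short product review using ONLY the facts listed below.\n"
        "Do not add flavors, numbers, or features that are not listed.\n\n"
        f"Facts:\n{facts}"
    )


# Hypothetical row: these values are placeholders, not real product data.
example_drink = {
    "name": "Example Energy Drink",
    "caffeine_mg": 160,
    "flavors": "Citrus, Berry",
}

prompt = build_prompt(example_drink)
print(prompt)
```

Because the caffeine figure appears once in the input, the model has a single number to work from, which is the kind of guardrail that would have helped with the 160 versus 40 milligram contradiction described earlier.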

The non-technical tech stack

So, let me tell you about the non-technical tech stack I’m using. I’m not a developer or a programmer, so I had to piece together a few different tools to make it work. Here’s how it all comes together:

First off, I needed a way to store the information I generate, and I found Google Sheets to be the perfect solution. It’s free, cloud-based, and accessible to anyone. I mean, who wouldn’t take advantage of that, right? I do pay for Excel monthly, but honestly, I’m unsure why.

Now, onto the generation part. Instead of spending days trying to figure out APIs one by one, I opted for a more efficient approach. I purchased the Bulk Publishing Spreadsheet from Arielle Phoenix. It already has OpenAI’s API connected to Google Sheets, which saves me a ton of time and hassle. It does cost around $200, but trust me, it’s worth it.

Finally, I needed a way to upload all the content to my site. That’s where WP All Import comes in. Other options are available, but I chose WP All Import because it suits my needs. I went with the paid version since it offers some extra features, like setting up periodic uploads. With this, I can work solely in Google Sheets, and WP All Import automatically pulls in the newly created content at the specified intervals.

So, in a nutshell, my tech stack includes Google Sheets as the database, the Bulk Publishing Spreadsheet connected to OpenAI’s API, and WP All Import for uploading the content to my site. It’s a straightforward setup that allows me to generate and import content seamlessly without requiring extensive programming knowledge.

Keyword research

Keywords play an essential role in SEO and editorial content. With traditional editorial content, every piece needs to perform well because you’re investing significant money in each article.

However, with AI-generated content, you can create 30 articles for as little as 24 cents while maintaining quality. Of course, you still need to go in and edit the content, but the initial output is quite impressive. Based on what I’ve seen, the quality of the content from OpenAI’s GPT-3 generation of models (or even GPT-4) is comparable to what you’d get from a writer charging around three cents per word.

The cost advantage of AI-generated content is significant. Even if only 20% of your content performs well, it doesn’t matter because the overall investment is still relatively low. It’s like conducting a hundred experiments. If some articles hit the mark, great. If not, it’s not a big deal because the cost was minimal.
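Here is the back-of-the-envelope math behind that claim, as a small sketch. The inputs are the figures quoted above (30 articles for 24 cents, a 20% hit rate); the function names are my own and editing time is not counted.

```python
def cost_per_article(total_cost: float, articles: int) -> float:
    """Average generation cost of one article."""
    return total_cost / articles


def cost_per_performing_article(total_cost: float, articles: int, hit_rate: float) -> float:
    """Generation cost spread over only the articles that perform well."""
    performing = articles * hit_rate
    return total_cost / performing


batch_cost = 0.24   # dollars for the whole batch
batch_size = 30     # articles in the batch
hit_rate = 0.20     # share of articles that perform well

per_article = cost_per_article(batch_cost, batch_size)
per_hit = cost_per_performing_article(batch_cost, batch_size, hit_rate)

print(f"Cost per article: ${per_article:.3f}")
print(f"Cost per performing article: ${per_hit:.2f}")
```

Even when four out of five articles go nowhere, each article that does perform has cost only a few cents to generate.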

Let’s consider an example using the niche “microphones” and the Sennheiser e835 microphone. Setting aside the exact search volume and keyword difficulty numbers, the idea is that if people are searching for the e835 versus another microphone like the SM58, they are likely to be interested in comparisons with other microphones as well. It’s about understanding the user intent behind the keyword.

With affordable AI-generated content, the search volume becomes less of a determining factor. Even if 80% of your content doesn’t receive many views, it doesn’t matter because the cost per article is minimal. Whether an article gets 2 or 140 monthly views, it still holds value.

Lastly, if you have knowledge of your industry and what people are searching for, you can tailor your spreadsheet and content accordingly. You have access to proprietary information, allowing you to target keywords based on your unique insights. So, while keywords play a role in AI-generated content, they might not have the same level of importance as in editorial SEO articles.

Let’s talk about data sets

When it comes to data sets, there are different levels of effectiveness. The more private your data set is, the better it is for your purposes.

Level 1: Public data sets

Public data sets, like the ones available on platforms like Kaggle or from a manufacturer, can be accessed by anyone.

For my experiment in testing data-fed AI-generated content, I used an affiliate information data set from the manufacturer. I didn’t focus on keyword data or whether it would rank. If it ranks well, great. If not, no problem. It was a low-cost experiment, and its purpose was to test prompts.

Level 2: Combined data sets

The next level is combining multiple data sets. By merging data sets, you can create unique insights and valuable content. For instance, combining weather data with fly fishing conditions in Colorado Springs allows you to address whether it’s suitable for fly fishing on a specific weekend. This approach takes things further and provides more value to your audience.

Additionally, you can integrate data from an API that provides updated information. By using this data in conjunction with other data sets, whether public or proprietary, you can offer real-time and unique insights that others might not have access to. This gives you a competitive edge and enhances the value of your content.
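As a rough sketch of what combining two data sets means in practice, the example below joins a weather table and a river-conditions table on their shared date. Everything in it is hypothetical: the dates, the numbers, the field names, and especially the wind and flow thresholds, which are placeholders rather than real fishing guidance.

```python
# Hypothetical data set 1: weather forecast keyed by date.
weather = {
    "2024-06-01": {"high_f": 72, "wind_mph": 6},
    "2024-06-02": {"high_f": 65, "wind_mph": 22},
}

# Hypothetical data set 2: river conditions keyed by date.
river = {
    "2024-06-01": {"flow_cfs": 180},
    "2024-06-02": {"flow_cfs": 175},
    "2024-06-03": {"flow_cfs": 400},
}


def combine(weather: dict, river: dict) -> dict:
    """Join the two data sets on date, keeping only dates present in both."""
    shared_dates = sorted(weather.keys() & river.keys())
    return {date: {**weather[date], **river[date]} for date in shared_dates}


def is_fishable(row: dict, max_wind_mph: int = 15, max_flow_cfs: int = 300) -> bool:
    """Placeholder rule: calm enough and the river is not running too high."""
    return row["wind_mph"] <= max_wind_mph and row["flow_cfs"] <= max_flow_cfs


combined = combine(weather, river)
verdicts = {date: is_fishable(row) for date, row in combined.items()}
print(verdicts)
```

Neither table answers the question on its own. The combined row is what gives you something to write about that a site using only one of the sources cannot.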

Level 3: Private data sets

The highest level is working with completely private data sets. This could involve modifying a public data set to make it unique and exclusive to you.

Alternatively, you could create your own data set from scratch. This can be achieved by reaching out to manufacturers for specific information, scraping websites for data, or consolidating fragmented information from various sources into a single, comprehensive data set.

Having proprietary information no one else has allows you to create truly distinctive and valuable content.

In my journey, I started with public data sets, experimenting and learning. Then I combined and compiled data sets to provide more unique insights. Now, I’m focused on creating proprietary data sets that are accurate, comprehensive, and highly valuable. This approach enables me to recreate successful sites multiple times, maximizing my earning potential.

Remember: Even if you have a good data set, there’s no guarantee that people are searching for that specific information.

Setting up your spreadsheet

To set up your data set using Google Sheets and the Bulk Publishing Spreadsheet connected to the OpenAI API, you can follow these steps:

  1. Structure your spreadsheet columns: Start by setting up the columns in your spreadsheet. Each row represents a product, and the columns represent different information about the product. For example, you can have columns for product names, features, benefits, pricing, and generative prompts.
  2. Enter product information: Fill in the corresponding cells with the relevant product information. Include details such as weight, color, dimensions, pricing, and any other relevant attributes. These details will serve as the basis for generating content.
  3. Maintain consistent H tags: Ensure that the H tags are consistent across the spreadsheet. This means the H2 headings should follow the same structure for each product. For example, you can have H2 headings like “Features of Product X,” “Benefits of Product X,” and “Cost of Product X.”
  4. Use formulas for content generation: In the cells where you want the content generated, you can use formulas provided in the Bulk Publishing Spreadsheet. These formulas typically involve custom functions that take product information from the corresponding cells and generate written content based on that data.
  5. Customize and refine the content generation process: Adjust the formulas and instructions in the spreadsheet to fine-tune the generated content. You can experiment with different writing styles, language complexity, or additional formatting options like lists or bold text. This allows you to tailor the content to your specific needs.
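The steps above can be sketched outside of Google Sheets, too. The snippet below is only an illustration of the structure (one row per product, one column per attribute, the same H2 pattern for every product). It is not how the Bulk Publishing Spreadsheet is implemented, and the column names, products, and prompt wording are assumptions.

```python
import csv
import io

# Hypothetical sheet export: one row per product, one column per attribute.
SHEET_CSV = """name,features,benefits,price
Product X,"Feature A; Feature B",Benefit A,49.99
Product Y,Feature C,"Benefit B; Benefit C",89.00
"""

# The same H2 pattern is reused for every product (step 3).
H2_TEMPLATES = {
    "features": "Features of {name}",
    "benefits": "Benefits of {name}",
    "price": "Cost of {name}",
}


def section_prompts(row: dict) -> list:
    """Return (heading, prompt) pairs for one product row (step 4)."""
    prompts = []
    for column, template in H2_TEMPLATES.items():
        heading = template.format(name=row["name"])
        prompt = (
            f"Write one paragraph for the section '{heading}'. "
            f"Use only this data: {row[column]}"
        )
        prompts.append((heading, prompt))
    return prompts


rows = list(csv.DictReader(io.StringIO(SHEET_CSV)))
for row in rows:
    for heading, prompt in section_prompts(row):
        print(heading, "->", prompt)
```

Changing the wording inside `section_prompts` is the equivalent of step 5: the data stays put while the instructions around it are tuned.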

What’s the cost for a site like this?

For this website experiment, I created 500 affiliate posts and 80 informational posts. The total cost for this project was around $70, excluding hosting, domain, and software expenses. Considering those, we could say the total cost came to approximately $85. Shared hosting, baby!

For the informational posts, I asked ChatGPT to generate 80 blog post ideas specific to the niche.

The main objective of this project was to acquire a useful data set, rather than solely focusing on keyword data. While it would be great if the site ranked well and generated income, that wasn’t the primary goal of this particular experiment.

What’s up next for me

So, I plan to create a private data set containing valuable information about a niche I’m quite familiar with. I can leverage this data to create multiple websites, each from a different point of view.

Let’s take the example of a microphone website (although I’m not actually creating one). I would create separate websites for different audiences, such as one for live musicians and one for hip-hop producers.

I aim to include around 20 products per vertical to keep things focused and manageable. This number is based on the popularity of the industry and the key players within it. While there may be other players, I’m initially concentrating on those 20 important ones. I’ll compile all the data by hand, working with a virtual assistant to gather and organize the information.

The data points in my data set will vary, some being technical details and others being opinion-based. However, they will all contribute to creating factual content tailored to specific audience needs. With just 20 products, you can generate 20 review posts, a bit less than 200 comparison posts, and a bunch of informational posts.
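The "a bit less than 200" figure comes from pairing every product with every other product once. Here is a quick sketch of that count, using placeholder product names:

```python
from itertools import combinations

# Placeholder names standing in for the 20 products in one vertical.
products = [f"Product {i}" for i in range(1, 21)]

# One review post per product.
review_posts = list(products)

# One comparison post per unordered pair (A vs. B is the same post as B vs. A).
comparison_pairs = list(combinations(products, 2))

print(len(review_posts))      # 20
print(len(comparison_pairs))  # 190
```

The count grows quickly: every product you add to the data set adds one review post plus a comparison against each product already there.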

The format for my posts will follow a consistent structure across the websites. For example, I might have posts like “Is the e835 Good for Podcasting?” or “How to Set Up Your e835 for Podcasting.”
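A consistent structure like this is easy to expand from templates. The sketch below fills the two title formats above for each microphone and use case; the list of use cases and the variable names are assumptions for illustration.

```python
from itertools import product as cartesian

# Title formats taken from the examples above.
TITLE_TEMPLATES = [
    "Is the {mic} Good for {use}?",
    "How to Set Up Your {mic} for {use}",
]

mics = ["e835", "SM58"]
uses = ["Podcasting", "Live Performance"]  # hypothetical use-case labels

# Every use case x every microphone x every title format.
titles = [
    template.format(mic=mic, use=use)
    for use, mic, template in cartesian(uses, mics, TITLE_TEMPLATES)
]

for title in titles:
    print(title)
```

Swapping the `uses` list is all it takes to produce the title set for a different audience from the same product data.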

I’ll also create comparison posts where I analyze the numbers and technical aspects of different microphones, allowing readers to make informed decisions based on their specific requirements. On top of that, I’ll include “alternatives to” posts, each providing a list of other microphone options with internal links to the full review posts.

The exciting part is that I can repeat this process for different target audiences, such as podcasters, live musicians, bedroom producers, DJs, and YouTubers. Using the same data set, I can create multiple websites that dominate the SERPs, ensuring that I still earn the commission no matter which site users land on. It’s almost like having a private blog network (PBN) strategy focused on SERP dominance rather than link building.

As a solo entrepreneur operating from my living room, I find working with AI-generated content to be a game-changer. It allows me to compete in terms of volume without having to spend thousands of dollars every month. By providing the AI with accurate information and guidance, I can ensure that the generated content meets high standards rather than being wild and unpredictable.
