Getting Started with Machine Learning in SEO with Lazarina Stoy
March 29, 2022
The In Search SEO Podcast
How much of your SEO are you automating? And did you know that automating what you’re currently doing could help you do it faster and with a greater level of accuracy?
Today, we’re discussing the six steps to getting started with machine learning in SEO, with an SEO manager specializing in all things technical and data. She’s also a content creator, sharing Data Studio dashboards, scripts, and other useful tools, helping and inspiring other SEOs to get through tasks a little more efficiently. Welcome to the SEO and data science manager at Intrepid Digital, Lazarina Stoy.
The steps are:
- Understand Your Limiting Beliefs and Overcome Them
- Understand Common Task Specifications, Solution Specifications, and Data Specifications in Machine Learning
- Practice Daily and Start Going Through the Motions
- Assess New Tasks, Their Solutions, and Data Characteristics to Understand When Machine Learning is Needed
- Understand the Limitations and Scrutinize the Output of Machine Learning
- Work Collaboratively and Set Reasonable Expectations
Lazarina: Hello, there. Happy to be here. Hello.
D: Happy to have you here. You can find Lazarina over at lazarinastoy.com. So Lazarina, how long until the machines take over SEO?
L: I hope in a long time. We need to take over the machine so we can be more efficient. They’re not taking over us I hope.
D: Maybe a neural link to the brain will help.
L: Yeah, we’re going to be way better than them. We just need to use them properly. Channel them where we need them.
D: There’s some positivity for you. So today we’re discussing the six steps to getting started with machine learning in SEO. Starting off with number one, understand your limiting beliefs and overcome them.
1. Understand Your Limiting Beliefs and Overcome Them
L: This little step is about actually thinking about what’s stopping you from actually pursuing machine learning a little bit more. What’s stopping you from getting started with SEO automation and doing things that are the fancy new scripts and tools and things like that. Because oftentimes, I hear a lot of people really inspired by new technology, and they want to try it. But they have these limiting beliefs that are stopping them from doing so. And it’s not something that is unique or specific to the SEO industry. Actually, this is something that has been very widely recognized in the machine learning community as well, because it’s a problem that a lot of the developers actually have, as they are not themselves starting with machine learning.
Actually, a data scientist that’s very famous in the machine learning community, Jason Brownlee, has actually made a list of some limiting beliefs, or some reasons why you’re not getting started. And if I have to put them in statements, it’s often the things that we say to ourselves, like that you have to be a Python expert to start or a coding expert to start, or you have to know the machine learning field from A to Z. Or what each algorithm does in order to get started. Or maybe you have to have a lot of free time to start or you have to have some time in your schedule, or the perfect PC or whatever else you may be telling yourself. The most difficult one to overcome is that you think it’s very difficult or challenging to get started. While most machine learning experts are actually going to tell you that executing a machine learning model literally takes three lines of code. It’s about understanding your data, and knowing where and when you can apply it. That’s the difficult part.
So overcoming these limiting beliefs is the first step because actually, it’s not hard to get started. You can literally Google machine learning and 10-minute tutorials and start small and build a habit every single day. And that way, if you’re just waiting to get started and you’re feeling some constraints mentally while you’ve not started, you can see that it’s a very small step that you need to take. You just have to block out 10 minutes in your schedule and get started. And once you execute your first script or install the libraries and everything, then you’re going to see that it’s actually not that challenging. And then the game shifts a little bit, then you just have to see where you can apply the models to the day-to-day life of an SEO. And that’s the fun part to be honest.
D: Understood. Essentially, what you’re saying is don’t let technology stop you from doing things. You don’t have to understand all aspects of the technology before you actually get started.
L: It’s a lot like in the field of SEO where you don’t have to understand everything in order to get your foot through the door. You just have to have the passion and the desire to do it.
D: I think the challenge is that many SEOs are of the mindset that they want to understand everything before doing something. I think they’ve got that kind of brain where they need to understand the why before doing something.
Number two is to understand common task specifications, solution specifications, and data specifications in machine learning.
2. Understand Common Task Specifications, Solution Specifications, and Data Specifications in Machine Learning
L: Yeah, I’m going to start with the data first. You need to know when you’re searching for a specific… Let’s say, you’ve already passed step one, you have some daily practice, maybe 10-minute tutorials, you see this is super cool and fancy, and you’ve been doing it for a while. And now you kind of encounter a task in your day-to-day life. And you want to see whether machine learning is the correct solution to help you overcome some challenges that you’re facing. The three things that you need to think about are the data characteristics that you have. I.e., the data set that you’re going to apply machine learning to. And that can be textual, numeric data, or it can also be image-based data. But we’re talking about the beginner scenario here, of course, you have other things like multimodal machine learning where you apply video or audio files, and you apply machine learning to them. But we’re just talking about the beginner’s case here. And most of the time, as SEOs, the tasks that we give the model are going to be text-based, so for instance, the content on the page, or they’re going to be numeric. So for instance, if you’re trying to predict organic traffic, or you’re trying to predict the number of clicks that you’re going to get, and things like that.
And when it comes to task characteristics, we know that there are two main fields in machine learning, supervised and unsupervised. And we need to know the main models that you can, in the most common case, use for these specific types of tasks. So for us, supervised learning means that you have labeled data to validate the output of the model. And unsupervised means the opposite, you don’t have a way to validate the results. In supervised learning, you have things like regression, which is about making predictions, or classification, which means splitting into groups based on existing classes.
Just to give you an example of both of these things: making predictions, which we already discussed, like predicting organic traffic or things like that, is a good case, especially when you’re working with big data. Classification is splitting into groups. For instance, if you have a part of your blog that you already have categorized with specific categories, and you have new content that you want to classify into one of these specific groups, that’s where you can use classification to help you. In unsupervised learning, you have clustering and dimensionality reduction, which are a little bit more advanced, and which you’re probably going to use when you have a very large set of data, or when you have a problem that you don’t really have a way to validate. And oftentimes, you might be able to combine the two approaches as well.
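The blog category example can be sketched in a few lines. This is a deliberately simple word-overlap classifier, not a production model and not a method named in the conversation. The categories, the sample posts, and the function names `tokenize`, `train`, and `classify` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


# Hypothetical, already-categorized blog posts (label, text).
labeled_posts = [
    ("technical seo", "crawl budget robots sitemap indexing log files"),
    ("technical seo", "site speed core web vitals rendering javascript"),
    ("content", "keyword research topic clusters writing briefs"),
    ("content", "editorial calendar writing headlines content refresh"),
]


def train(posts):
    """Build one word-frequency profile per existing category."""
    profiles = {}
    for label, text in posts:
        profiles.setdefault(label, Counter()).update(tokenize(text))
    return profiles


def classify(text: str, profiles) -> str:
    """Return the category whose profile shares the most words with the text."""
    words = Counter(tokenize(text))

    def score(label):
        return sum(count * profiles[label][word] for word, count in words.items())

    return max(profiles, key=score)


profiles = train(labeled_posts)
print(classify("How to audit your sitemap and robots file", profiles))
```

Because the existing posts already carry labels, the result for a new post can be checked by a person against the known categories, which is what makes this a supervised setup.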
When it comes to the solution characteristics, the most important thing to know is when and where you should be applying machine learning. On my website, in a blog post called Beginner’s Guide to Machine Learning, I’ve actually included a couple of flowcharts that can help you go through the process of deciding how you are applying machine learning and whether that’s the correct use case for it. For instance, if you’re not working with big data, then it is not a machine learning task; it can literally be done in a spreadsheet or a Google Sheet with a couple of statistical formulas. Because essentially, at its core, most machine learning models are exactly that. Statistics. And yeah, if the task that you’re doing is mission-critical, you shouldn’t be relying on machine learning at all. And if you need to relay to your stakeholders the way that the results have been achieved, or maybe try to explain or replicate that model’s output, there are certain models that you should be avoiding, for instance, supervised machine learning, or deep learning, because most of them work kind of like a black box. And it’s quite difficult, especially with big data, to replicate what the model has done in order to reach the output.
A lot of food for thought here. But I just want to say that if you understand these three things, you’re able to say for your specific problem, “Okay, my data is textual. The model that I need is regression-based or maybe classification-based,” or whatever it is. So you need to pinpoint what your data is and what your task is. And you need to understand what type of solution you are going to be looking for. And if you can do that, then it’s going to be a lot easier to find the appropriate resources to help you with your goal.
D: And number three is to enable daily practice and start going through the motions.
3. Practice Daily and Start Going Through the Motions
D: Which brings us up to number four, when encountering new tasks, assess the task solution and data characteristics to understand whether machine learning is really needed at all.
4. Assess New Tasks, Their Solutions, and Data Characteristics to Understand When Machine Learning is Needed
L: Yeah, again, something I touched upon before. There are different ways that you can assess whether machine learning is needed. And I already mentioned those couple of flowcharts. But essentially, every single task that you encounter, and let’s take, for example, writing meta descriptions, because that’s something that we do quite often. Well, we shouldn’t be spending too much time on it. We know that they’re not very important for SEO, but as part of an on-page optimization project, that’s something that you should do, you should optimize them.
Let’s deconstruct a task here. If you say that your input data is textual, that means that yes, the page content is textual. And what is the task in that case? Is it supervised or unsupervised? We know that it’s unsupervised. Because there’s no way for us to validate the results of the output, we have to do it ourselves, we don’t have an automated way to do it. In that case, we are going to be looking for a model that is transformational, for instance, taking the text of the page, and transforming it to less than 160 characters output. So taking sentences from the text means extraction, or it can also be summarization, as well. But another way to do this is to use a generative model like GPT-3. We give it the input, i.e., the text on the page, and it generates the meta descriptions and writes them from scratch, essentially.
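The extraction approach mentioned above can be sketched without any machine learning library: take leading sentences from the page text until the 160-character budget is used up. This is only one simple reading of "extraction", under the assumption that the opening sentences summarize the page. The function name `extract_meta_description` is hypothetical.

```python
import re

MAX_LENGTH = 160  # character budget mentioned in the discussion


def extract_meta_description(page_text: str, max_length: int = MAX_LENGTH) -> str:
    """Join leading sentences until adding another would exceed max_length."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", page_text.strip())
    description = ""
    for sentence in sentences:
        candidate = f"{description} {sentence}".strip()
        if len(candidate) > max_length:
            break
        description = candidate
    # If even the first sentence is too long, cut at the last space in the budget.
    if not description:
        description = page_text.strip()[:max_length].rsplit(" ", 1)[0]
    return description


page = (
    "Learn machine learning for SEO. Start with ten minute tutorials. "
    + "x" * 200 + "."
)
print(extract_meta_description(page))
```

A generative model would write the description from scratch instead, so its output can differ between runs, while this extractive sketch always returns the same text for the same page.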
Reverting back to whether it’s mission-critical, we know that it’s not. So machine learning is good for this type of exercise. Is it okay that sometimes, when you run this model for the same type of page, you might get different outputs? For instance, if you run the model twice, you might get two different meta descriptions for the same page. That’s absolutely okay. We can choose from both of them. That’s not a problem at all. Do we need an explanation of how we have written these meta descriptions? No, not at all. And we wouldn’t need to be explaining to our stakeholders how we have done this. Does it outperform average methods? That’s a very important question when you want to assess whether to use machine learning, and we can say yes, absolutely, because it’s much faster. If you take this framework and apply it to the tasks that you’re considering for machine learning, then you might find very quickly which tasks are suitable and which are not.

Just to give you an example, that might also depend on the niche as well. If we take the exact same example for titles and H1s, then for a niche that is very non-competitive and not a Your Money or Your Life type of thing, we might take the same stance: it is not mission-critical. The tasks can be automated, we can implement machine learning, and we don’t need to give explanations on how we have written the titles and the H1s. But if, for instance, our client is HMRC, we know that it is very important to get this right. We don’t want to be suggesting titles, H1s, or meta descriptions that are not completely on point. So sometimes your client, the industry that you work in, or the particular site that you’re working on might be the reason why you cannot implement some of these tools.
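The questions in this framework can be written down as a small checklist. The sketch below is a simplified reading of the conversation, not the flowcharts from the guide mentioned earlier. The `Task` fields, the order of the checks, and the wording of the recommendations are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of an SEO task being assessed."""
    name: str
    large_dataset: bool        # is there enough data to justify a model?
    mission_critical: bool     # would a wrong output cause real harm?
    needs_explanation: bool    # must stakeholders see how results were reached?
    varied_output_ok: bool     # is it fine if two runs give different results?
    beats_manual_method: bool  # is it faster or better than doing it by hand?


def assess(task: Task) -> str:
    """Return a rough recommendation based on the questions discussed."""
    if task.mission_critical:
        return "do not rely on machine learning"
    if not task.large_dataset:
        return "use a spreadsheet with statistical formulas"
    if task.needs_explanation:
        return "avoid black-box models"
    if not (task.varied_output_ok and task.beats_manual_method):
        return "keep the manual process"
    return "good candidate for machine learning"


meta_descriptions = Task("meta descriptions", True, False, False, True, True)
hmrc_titles = Task("titles for a tax authority site", True, True, True, False, True)

print(assess(meta_descriptions))
print(assess(hmrc_titles))
```

The same task can land in a different branch depending on the client, which is why the mission-critical check comes first in this sketch.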
D: I think that leads us quite nicely up to number five, which is, when working with machine learning, to understand its limitations and scrutinize the output.
5. Understand the Limitations and Scrutinize the Output of Machine Learning
L: Absolutely, and this one is very important. There have been many times when I’ve thought that something could be done with machine learning, and I’ve implemented a model or tested a script, because there are so many scripts out there. Honestly, if you know how to search for them, there’s almost a script for any task that you can think of; you just have to know how to apply it to the particular problem in SEO. And the number of times I’ve tested some and ended up not using the output is a lot. So you need to know what machine learning can do. Right now machine learning is at a stage where it’s very good at narrow tasks. As a beginner, you are going to be using pre-trained models most of the time; you’re not going to be training them yourself, and that’s a whole other topic on its own. But if you are using pre-trained models, most of the time you’re going to see that the data set that they have been trained on is not particularly relevant to your industry, or it’s not as in-depth as you might want it to be. And that’s particularly the case for text-based tasks like NLP and things like that.
And here, you just have to know two things. First of all, even having a foundation to work with is a good enough thing to have. If you know that, yes, for instance, the meta descriptions that the model has generated are not as good but they can be fixed. And if you think that, if you edit them, it’s going to be a lot quicker for you to generate the final output, then you should pat yourself on the back, you’ve done a great job because you have saved yourself a ton of time, even with that. And if you think that the output is just not useful at all, that doesn’t mean that all machine learning is not useful. That just means that for this specific task, this specific model that you’ve used has not been useful. And that’s perfectly okay. Because you can even say to your stakeholder, or your client, that we tested a few approaches, we tested an automated approach. And it didn’t work because of this and this reason. And then you can even use the model’s output in order to compare and contrast with the output that your team has generated. So in all cases, implementing or trying machine learning is going to make your case a lot stronger. You just have to know when to say okay, we tried that, but it didn’t work and how to use this to your advantage as well.
D: And talking about outputs, that takes us neatly to number six, which is to work collaboratively and set reasonable expectations.
6. Work Collaboratively and Set Reasonable Expectations
L: This last step is all about knowing when you need help and knowing how to get the right help. And here I think there are a few things that you can pursue in order to get you the help that you need. First of all, find a machine learning buddy. Someone that is the same type of person as you. They work in the same industry. They have the same type of problems. And you’re thinking and encountering problems and you’re researching for tasks kind of together. And when someone finds something that’s useful for the role, then they share it with their buddy. It helps you keep yourself accountable, it helps you keep yourself motivated and it genuinely is a very good thing to have.
Another solution here would be to join a tribe, like a machine learning tribe. Again, I’m going to call on Jason Brownlee here. He has created this chart, a breakdown of the different machine learning tribes that are out there. And for us, as SEOs, we belong to either a business tribe, so those might be managers that are trying to see whether automation or machine learning is the correct solution for their problem, or whether it can be used for their team, or data tribes, which are, for instance, data analysts that are trying to be a little bit better at understanding data. And the reason why I mentioned these two types of tribes specifically is that if you are in a group or community like that, you know that the people there are trying to find the same type of results, and their approach to the problem is similar to yours. So you’re not going to be scrutinized, for instance, if you don’t know the advanced concepts related to coding or the math behind the machine learning model. You’re going to be surrounded by people that are very sympathetic to the type of challenges that you have. In another community, that might be different. For instance, if you are in a community with Python developers, and you’re asking why your model is not working, and it turns out to be a comma or something like that, you might feel a little bit more scrutiny. The responses to your query might be a little bit harsher, and we don’t want you to feel unmotivated. So that’s why I emphasize finding the right type of community or the right type of tribe.
And the third thing, and I think this is something that you should apply regardless of what approach you choose, is just reaching out to developers whenever you can, whenever you feel like you cannot solve the problem on your own. If you have spent like six hours on Stack Overflow, and you’re stuck on just one error, and you’re feeling super unmotivated, just reach out to someone. There are so many super skilled Python developers, even in the SEO community. If you reach out to them, they might have already encountered a problem like that, so they might give you directions. And I’m not saying reach out to them so they can write your script for you. But they can just guide you in the right direction. And that can be very inspiring and motivating as well.
D: Lazarina, that was absolutely brilliant. I know that you can talk for weeks about machine learning for SEO. That was just a brief introduction. But it really demonstrates your knowledge. And I’m sure you can dive into specific areas. It’d be great to get you back at some point, maybe we can get you back and dive into something like Python for SEO or some other more niche topic there.
The Pareto Pickle – Analyze the SERP for New Opportunities
Let’s finish off with the Pareto Pickle. Pareto says you can get 80% of your results from 20% of your efforts. What’s one SEO activity that you would recommend that provides incredible results for modest levels of effort?
L: I’ve thought about this a lot. And I wanted to say something that I think I haven’t seen done that often. And for me, it’s SERP analysis. We have a lot of opportunities to analyze search engine results pages, especially at scale. So if you have already done your keyword research, you know the keywords where your site is ranking, the ones where it is not ranking, the content gap, and everything. I would say that if you can do a very wide-scale SERP analysis, using a tool like DataForSEO, you can get a very good picture of the market. Like who is ranking where, what brands you are competing with, how they are structuring their titles and meta descriptions, how long their content is, and all sorts of analysis like that. And if at that level, you know when and where to implement machine learning, then that activity alone is going to influence your entire content strategy. And it’s going to keep you on track in order for you to remain competitive later on as well. So it’s something that you do once for a specific client and project, and you can use it for a couple of months down the line to guide you and influence you in many of the other strategies that you’re going to do.
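A small sketch of what that kind of SERP analysis might look like once the results are collected. The rows below are invented, and their shape (keyword, position, url, title) is an assumption for illustration. It is not the response format of DataForSEO or any other tool, and the helper names are hypothetical.

```python
from collections import Counter
from statistics import mean
from urllib.parse import urlparse

# Hypothetical SERP rows, already exported from whatever tool was used.
serp_rows = [
    {"keyword": "seo automation", "position": 1,
     "url": "https://example.com/automation", "title": "SEO Automation Guide"},
    {"keyword": "seo automation", "position": 2,
     "url": "https://example.org/seo-scripts", "title": "Scripts for SEO"},
    {"keyword": "seo automation", "position": 5,
     "url": "https://example.org/tools", "title": "SEO Tools"},
    {"keyword": "machine learning seo", "position": 1,
     "url": "https://example.net/ml", "title": "Machine Learning in SEO"},
    {"keyword": "machine learning seo", "position": 2,
     "url": "https://example.com/ml-seo", "title": "ML for SEO Beginners"},
]


def domain_share(rows, top_n: int = 3):
    """Count how often each domain appears within the top_n positions."""
    domains = Counter(
        urlparse(row["url"]).netloc for row in rows if row["position"] <= top_n
    )
    return domains.most_common()


def average_title_length(rows) -> float:
    """Average number of characters in the ranking titles."""
    return mean(len(row["title"]) for row in rows)


print(domain_share(serp_rows))
print(average_title_length(serp_rows))
```

Run across a full keyword set, counts like these show who is ranking where and how competitors structure their titles.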
D: I really want to dive into that. I want to keep on going and ask you more questions. But I know that that’s going to take an extra half hour or so and we don’t have the time to do that just now. Hopefully, we’ll get you back in a future episode. For now, I’ve been your host, David Bain. Thank you so much for being on the In Search SEO podcast.
L: Thank you so much for having me, David. It’s been a pleasure.
D: And thank you for listening.