A Scientist’s Guide to Growth Marketing

Written by:

Alex Steeno

Advisor @ Galactic Fed

Published July 15, 2020

I always thought I was going to be a full-time scientist. I was right, but with my background in chemistry, I’m not doing the kind of scientific exploration that I expected. I thought I would spend all of my days in a laboratory carefully repeating and iterating on experiments until I achieved the desired result with optimal significance.

I’m still a scientist – but the shape this has taken makes me a modern growth marketer. These are the newest scientists born from the digital age in the world of performance marketing. We are the breed that serves as end-to-end idea factories constantly testing these new ideas and stacking on successful experiments to turn companies into rocket ships.

While I learned chemistry, experimentation and the scientific method, that’s not what I ended up doing. Fortunately, I realized that learning hard science taught me a problem-solving paradigm. It taught me exactly how to frame problems, glean meaningful insights from data, and feed those insights back through the scientific method loop to design better (and more successful) experiments.

As an academically trained chemist, this is how I think about growth marketing. Let’s dig in.

What Variables Can I Test?

The short answer: almost everything. No matter what your business model is, you have a funnel, you have acquisition channels, you have customer life cycle stages, and I’m sure you want to improve all of these. Testing is going to be your medium to do so.

No matter what life cycle stage a prospect or customer is in, there are myriad tests you can run. I’m going to give you a framework to build this out or improve upon it within your organization. You’re probably already doing testing of some sort, but I’m going to give you the resources and tools to scale this, make sure you’re doing it well, and provide systems to make the process as fluid as possible.

Here are some things you can test:

Ad Copy: what verbiage converts the best?
Audiences: how can you adjust your targeting to maximize customer acquisition?
Landing Pages: how might we design/iterate on a fast-loading, well-converting landing page?
Email Marketing: how can I best engage my prospects to drive funnel through-put?
Sales Process: How can I alter my sales teams’ playbooks to maximize opportunities and deals?
Retention and Evangelism: How can I best equip my customers to drive referrals and talk about our product/service?

This should be enough to get the wheels turning in your head. There are a ton of things to test, and once you understand how to isolate variables, ask the right questions and formulate good hypotheses, you’ll have an endless list of tests to run.

Action Step: Start thinking about what steps in your funnel could be improved. Use your gut. You know your customers and the market. Where do you believe you could make improvements either in bolstering funnel stages or reducing fallout?

The Scientific Method

The scientific method has 5 major parts. We’ll discuss each briefly in this section and expand on certain parts later in the article. Here are the steps:

Formulate a Question

This could be anything related to a business goal. A good example could be “How might we increase conversion rate on landing pages?” These questions can be as general or as specific as you like, just as long as they can be measured in a meaningful way.

Action Step: Generate your first “How might we….” question. This could be anything from “How might we reduce fallout at some life cycle stage?” to “How might we drive more organic search traffic to our website?”

Construct a Hypothesis

Next, we’ll want to come up with a hypothesis, or a potential answer, regarding our question. This hypothesis should be informed in some way, whether that’s internally or from external sources. For example, say your question is “How might I reduce fallout from prospects in the Sales Qualified Leads (SQL) to Deal life cycle stages?” Because your sales team is speaking with these prospects, it’s likely that you have some solid data on why this fallout happens (whether it’s anecdotal or formally tracked). The hypothesis is then used to come up with solutions to fallout reasons that are already known. If this isn’t already captured, this is a great opportunity to start capturing data which could fuel future experiments (more on this in “Laying the Foundation with Data”).

A good hypothesis should follow the general form: “If {variant experience} occurs, then {metric} will {decrease/increase} by {goal}.”

Note that we are specifying what the change will be in {variant experience}, calling out exactly what KPI we will be measuring using {metric} and setting a {goal} value within the hypothesis. The {goal} variable should account for statistical significance along with an incremental improvement large enough that it makes sense to invest the resources to scale the change across the whole customer base.

Here’s an example of a good hypothesis: “If we change the on-page CTA from ‘Learn More’ to ‘Get Started’, then Visit <> Lead conversion rate will increase by 10%.”

Note that in some cases, we’ll also refer to this {goal} variable as the Minimum Detectable Effect (MDE), which helps us determine how large a sample we need before the test can reach statistical significance. More on this in the following section.

Lastly, this structure is not necessarily an end-all, be-all solution to generating good hypotheses but is designed to give you a good start.
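
To make the template concrete, here is a minimal sketch that stores the pieces of a hypothesis as a small record and renders the sentence from them. The `Hypothesis` class and its field names are my own illustration rather than part of any tool mentioned in this article.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical record mirroring the template in this section."""
    variant_experience: str  # the change being tested
    metric: str              # the KPI being measured
    direction: str           # "increase" or "decrease"
    goal_pct: float          # targeted change, in percent

    def statement(self) -> str:
        if self.direction not in ("increase", "decrease"):
            raise ValueError("direction must be 'increase' or 'decrease'")
        return (
            f"If {self.variant_experience}, then {self.metric} "
            f"will {self.direction} by {self.goal_pct:g}%."
        )


cta_test = Hypothesis(
    variant_experience="we change the on-page CTA from 'Learn More' to 'Get Started'",
    metric="Visit <> Lead conversion rate",
    direction="increase",
    goal_pct=10,
)
print(cta_test.statement())
```

Keeping the parts separate like this makes it easy to store hypotheses in a spreadsheet or database and sort or filter them by metric later.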

Action Step: Using your “How might we…” question, create a hypothesis using the above structure/organization.

Structure a Test

The next step is to figure out how to test this on a subset of your prospects or customers in such a way that we can mitigate risk if our hypothesis is dead wrong. Generally, this is done by testing it on anywhere from 3-20% of traffic as the experimental group. The remainder of the traffic is referred to as the “control group” or the group that is completely isolated from the change in experience. In science, we refer to the ‘variant experience’ as the independent variable and our goal in this phase is to measure the effects of the independent variable.

In order to determine what percentage of your traffic to use, you’ll want to take into account a couple major factors. The first is risk. If you are completely off-base in your hypothesis (which will happen), how much potential revenue are you willing to sacrifice? This is something we’ll obviously want to minimize.

The second factor is statistical significance, which ties back to the MDE we mentioned earlier. Do some quick back-of-the-napkin math to determine how large your test buckets need to be to make the test worth it. I have this calculator bookmarked for my quick statistical significance checks.

And finally, the third factor to consider is time-frame. You’ll want to marry this with the first two factors in order to determine how many days/weeks/months it will take for you to hit statistical significance in your experimental group. From here, play with the numbers to see what makes sense for the test. There is another really important factor in this process, Leading & Lagging Indicators, which will be covered in greater detail later, but this is good for now.
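
The back-of-the-napkin math for these three factors can be sketched in a few lines. This is a rough approximation, not a replacement for a proper calculator: it uses a common normal-approximation formula for comparing two conversion rates, assumes equal-sized groups (which errs on the conservative side for the smaller experimental bucket when the control group is larger), and the baseline rate, traffic and traffic share in the example are made-up inputs. The function names are my own.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per group to detect a relative lift.

    baseline_rate: current conversion rate, e.g. 0.05 for 5%
    relative_mde:  minimum detectable effect as a relative lift, e.g. 0.10
    alpha:         1 - significance level (0.05 corresponds to 95%)
    power:         chance of detecting the lift if it is real
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_fill(n_needed, daily_visitors, experiment_share):
    """Days until the experimental group has collected n_needed visitors."""
    return math.ceil(n_needed / (daily_visitors * experiment_share))


# Hypothetical inputs: 5% baseline conversion, 10% relative MDE,
# 2,000 visitors per day, 10% of traffic in the experimental group.
n = sample_size_per_group(baseline_rate=0.05, relative_mde=0.10)
print(n, "visitors needed in the experimental group")
print(days_to_fill(n, daily_visitors=2000, experiment_share=0.10), "days")
```

Playing with `experiment_share` shows the trade-off directly: a smaller share lowers risk but stretches the time-frame.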

The actual tools used to test these variables can differ depending on what type of test you’re running. Here are a few examples:

On-site Copy/elements – I’d recommend using Google Optimize or Optimizely. These tools will generally cover on-site elements well.
Landing Pages – If you want to rapidly deploy landing pages to test conversion, I would recommend using a tool like Unbounce where you can rapidly create and deploy similar pages.
Customer Experience – For general customer-centric experiences (like changes in sales process or account management), I would recommend using a tag or additional variable inside your CRM. For example, in Hubspot, you can create a list where the customer IDs end in “2” to create a 10% subset of your customers. Add in additional numbers for higher percentages.
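
The last-digit trick from the Customer Experience example can be sketched as plain logic. This is only an illustration of the idea, not a call into any CRM; it assumes numeric customer IDs whose last digits are spread roughly evenly, and the function name and sample IDs are hypothetical.

```python
def in_test_group(customer_id: int, test_digits=frozenset({2})) -> bool:
    """Return True if the customer's ID ends in one of the chosen digits.

    Each digit added to test_digits grows the test group by roughly
    10% of customers, assuming last digits are evenly distributed.
    """
    return customer_id % 10 in test_digits


# Hypothetical block of 1,000 sequential customer IDs.
customer_ids = range(1000, 2000)

ten_percent = [cid for cid in customer_ids if in_test_group(cid)]
twenty_percent = [cid for cid in customer_ids if in_test_group(cid, frozenset({2, 7}))]

print(len(ten_percent) / len(customer_ids))     # share ending in 2
print(len(twenty_percent) / len(customer_ids))  # share ending in 2 or 7
```

Because the assignment depends only on the ID, a customer always lands in the same group, which keeps their experience consistent for the length of the test.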

Action step: Think about your business model, how you acquire customers and how you nurture leads. What platforms are you using that they engage with? Using the list above, start exploring what tools will work best for you.

Analysis and Reporting

Now that you are good to go and have pulled the trigger on the test, it’s time to check in on those results regularly. I would suggest making it a habit to check on your ongoing test first thing in the morning each day. That way, if something needs immediate attention or you really have something going down in flames, you’re able to take immediate action.

The most important thing to communicate in this section is not to celebrate (or freak out) too early. Your results are going to be all over the board at first but this is just normal chaos. Until you start getting reasonable sample sizes on the experiment side, you might see extreme outcomes, both positive and negative. This is why monitoring statistical significance is so important. Keep the test going until you see a meaningful significance level, generally 95% or more; by that point you’ll likely have many hundreds of samples or more.
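
One common way to check significance for a conversion-rate test is a two-proportion z-test. The sketch below is a simplified version using only the standard library; the function name and the conversion counts in the example are hypothetical, and a dedicated calculator or statistics package may handle edge cases more carefully.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_control, n_control, conv_variant, n_variant):
    """Two-sided p-value for the difference between two conversion rates."""
    p_control = conv_control / n_control
    p_variant = conv_variant / n_variant
    pooled = (conv_control + conv_variant) / (n_control + n_variant)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))
    if std_err == 0:
        return 1.0  # no variation at all, so nothing to distinguish
    z = (p_variant - p_control) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical early read: small samples, noisy result.
early = two_proportion_p_value(conv_control=45, n_control=900,
                               conv_variant=8, n_variant=100)
# Hypothetical later read: same rates, ten times the traffic.
later = two_proportion_p_value(conv_control=450, n_control=9000,
                               conv_variant=80, n_variant=1000)

for label, p in (("early", early), ("later", later)):
    verdict = "significant at 95%" if p < 0.05 else "keep the test running"
    print(f"{label}: p-value = {p:.4f} -> {verdict}")
```

The two reads use the same conversion rates; only the sample size changes, which is exactly why early results should not be celebrated or feared.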

Implement or Refine

Once you have the results in hand, you’ll want to go through the data to decide on the next step. Hopefully, within a reasonable period of time, you have a clear winner between your experimental and control groups. Because I want you to win, I hope that your experiment won and you have a clear business case for implementing the experience across the entire organization. If it’s a small copy/button change, that’s amazing because it’ll be easy to implement, but if it’s a new sales process or conversion funnel, I could write an entire extra piece on change management within organizations. The important part is that you have clear data proving that your experiment yielded a measured, controlled and positive effect on the experience and therefore the revenue impact can be quantified.

It’s important to note that independent of the outcome of your test, you can use the results to further refine your existing hypothesis, document learnings and even use the data to fuel additional tests. In our example of “If we change the on-page CTA from ‘Learn More’ to ‘Get Started’, then Visit <> Lead conversion rate will increase by 10%”, one potential result is that the button gets a lot more clicks but that doesn’t translate to more conversions. We can use this data to ask better questions, such as “Why don’t people convert as a lead even though they are clicking the button?” One potential idea could be that they don’t trust us enough or we may lack the social proof necessary to get them to submit their information. Thus, we could test adding in our current clients, providing a secondary link to a case study, telling a short snippet of our story on the page, and so on. Get in the habit of constantly digging deeper and asking better, more specific questions. This will make structuring and executing tests much more seamless and mechanical, and hopefully give you a higher rate of success.

Laying the Foundation with Data

Arguably, the most important aspect of your testing and experimentation infrastructure is going to be the underlying data that can support this. The vast majority, if not all, of your hypotheses will be both generated and supported by some data you have in-house. Thankfully, in most modern systems, the data that you need is already being captured. Google Analytics, Hubspot, your web-app’s database, etc. are already capturing the vast majority of the data you need to make decisions. However, one area of data that can constantly be improved is capturing reasons for fallout or missed opportunities.

Imagine you’re a marketing agency and you’ve just had a sales call with a lead. This is either going to progress to becoming an opportunity/deal or it’s going to turn out to be a bad lead. Capturing this bad lead through a “reason code” is going to provide actionable data to support further tests to reduce fallout. The first step is to implement a system that triggers when your sales team marks a lead as “bad”. Most CRMs, like Hubspot, already have the capability of doing this and you can customize the reason codes. You can create a new field, set it as a drop-down menu and trigger a workflow to create a task assigned to the salesperson to provide feedback on the bad lead reason. Over time, you’ll generate a healthy amount of data from which you can generate educated hypotheses on minimizing this fallout.

Action Step: Think about what you currently measure when it comes to fallout. Are there any areas for improvement there? How might you start capturing actionable data that you might be missing out on?

Asking the Right Questions

A good hypothesis can only be derived from asking the right questions. The right questions all share a similar form, and their most important trait is open-endedness. They are never leading questions, and they always contain one KPI that we are seeking to improve along with whatever level of audience specificity can be supplied. This could be demographics, firmographics or even a marketing channel. The framework I use to approach this follows this structure:

Ask a meaningful question: How might we improve onsite conversion for first-time visitors?

List potential solutions:

Reduce unnecessary form fields
Increase load speed of first meaningful paint
Move the CTA up to the primary view port
Change to a bolder headline font
Etc.

Form these solutions into hypotheses using the rules above: “If we move the CTA button into the primary view port, onsite conversion from first-time visitors will increase by 15%.”

From here, you can generate at least one testable hypothesis from each potential solution you’ve listed out. But the first step is to make sure you’re asking the right questions (which you’re going to be swimming in since you’ve laid a great foundation with data).

Action Step: Go back to your original “How might we…” question. Is this the right question you should be asking? Can this be further refined to generate even better solutions/potential hypotheses?

Determining Meaningful KPIs

So often I find companies focused on KPIs that don’t actually matter to their business. In the real estate industry, many brokers get fixated on minimizing their Cost per Lead (CPL). While this is an important Leading Indicator (more on this later), it isn’t the number that ultimately matters to their business.

A better KPI would be their Cost per Contract (when someone actually buys/sells a house) or the conversion rate of Lead <> Contract. This is tremendously more meaningful because then we can assign a clear dollar value to every lead that we acquire, and it also makes that Cost per Lead metric more meaningful in and of itself. If we know that each contract is worth $3,000 and Leads convert to Contracts at 10%, each lead is worth $300 on average, so we want to make sure that our CPL stays well under $300.
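
The arithmetic behind that $300 ceiling is simple enough to sketch. The function names are my own, and the $150 CPL in the second call is a made-up input for illustration.

```python
def break_even_cpl(contract_value, lead_to_contract_rate):
    """Expected revenue per lead, which is the most a lead can cost
    before acquisition loses money."""
    return contract_value * lead_to_contract_rate


def cost_per_contract(cost_per_lead, lead_to_contract_rate):
    """Average lead spend needed to produce one contract."""
    return cost_per_lead / lead_to_contract_rate


# Figures from the example above: $3,000 per contract, 10% Lead <> Contract.
print(f"Break-even CPL: ${break_even_cpl(3000, 0.10):.2f}")  # $300.00

# Hypothetical CPL of $150 at the same conversion rate.
print(f"Cost per Contract: ${cost_per_contract(150, 0.10):.2f}")
```

Comparing Cost per Contract against contract value, rather than looking at CPL alone, is what ties the leading indicator back to the business.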

Another important aspect to keep in mind is that one KPI above all else should be your north star. Generally, this is going to be revenue, margin, profit, etc., because, at the end of the day, you’re a business. Generating record levels of leads or having huge opportunities come into your pipeline is sexy, but if these are not converting to revenue, they essentially don’t matter.

The Importance of Leading Indicators

I’ve mentioned a few times now this concept of “indicators” within the testing and experimentation climate. There are two types: Leading and Lagging Indicators.

Leading Indicators are metrics we can use to get a directional read before the KPI that actually matters can be measured. For example, if your goal is to lose weight and you want to get a directional read on how you’re doing, you can look at calories eaten per day or calories burned per day. There isn’t a 100% causal relationship but odds are, if you’re eating less than you’re burning, you’re probably going to lose weight.

Leading indicators are extremely important in a good testing infrastructure for the exact same reason smoke detectors are important. Smoke detectors go off whenever there’s a fire. However, smoke detectors also go off when there might be a fire. When you’re broiling that steak and it makes the fire alarm go off, there isn’t necessarily danger, but left unchecked, that could turn into a potentially dangerous fire. Thus, we want to structure leading indicators to go off whenever there’s a reasonable chance there could be a fire.

Here’s an example: Going back to the real estate lead generation, if we are testing a new lead source and we see leads coming through at a $200 CPL vs. our usual $15, we can probably say that this isn’t going to be a worthwhile lead source. This may seem obvious but we cannot be certain since real estate deals are long and complex. You may also have 6-12 month sales cycles in your business and therefore being mindful of these leading indicators could mitigate a fair bit of your investment risk.

Lastly, it’s worth mentioning that leading indicators can be a great primary KPI for a test. In general, it’s intelligent to measure a leading indicator (Web Visit <> Lead conversion rate) as your primary test KPI but also give those leads time to move through the funnel beyond the testing period to make sure that everything looks good. This will allow you to have confidence both directionally and verify beyond meaningful doubt that your test has been not only a success in terms of our primary KPI but also for our true north.

Action Step: As a thought experiment, think of a few different variables that could be leading indicators for your true north. How do these directly or indirectly influence your metric? How might you improve on these to get some quick wins?

False Positives and Negatives

In testing, there will be instances where we can potentially fall victim to false positives or false negatives. A false positive is a test result that appears to be positive (read: successful) but there exist other confounding variables that we haven’t accounted for which could influence the outcome of the experiment.

If I am A/B testing retail stores during the month of December and I put a giant Santa outside, I bet this would increase the number of guests I receive in my store by a significant margin. Further, this would also measurably increase revenue, so a simple conclusion that I could draw from this experiment is that putting a giant Santa outside all my stores will increase revenue by 20%. However, this only works because of seasonal factors influencing consumer behavior. If we rolled this out year round to all our stores, it would likely hurt revenue in the long run. Thus, this is an example of a false positive.

A false negative is the opposite case. It’s a test result that appears to be a failure but is actually a success. Imagine that you’re advertising on a niche podcast. It’s expensive and you aren’t really generating a lot of leads. As a result, your Cost per Lead is going to be really high and as a leading indicator, it may look like a failure. But if we continue monitoring this cohort of leads as they move through the sales process, we see they move from Lead <> Customer at an 80% rate. The metric that matters here is Marketing ROI and we see this metric shoot through the roof. If we had relied only on our leading indicator, we would have suffered a false negative and missed a valuable opportunity.

Prioritization Framework

If you’re following all the steps in this article by setting up excellent data infrastructure, asking the right questions, formulating hypotheses, etc., you’re going to have so many tests to run that there simply aren’t enough resources, traffic, or time to execute all of them. The way to combat this problem is to build a prioritization framework into the system.

For testing, I always found using the ICE Framework to be most relevant. ICE stands for Impact, Confidence, Ease. The goal is to standardize tests in such a way that they can be treated equally. Each of these metrics is evaluated on a scale from 1 through 10. For Impact, 1 is least impact whereas 10 is maximum impact. For Confidence, 1 is the complete lack of confidence whereas a 10 is confidence beyond a shadow of a doubt. And lastly, for Ease, a 1 is the most challenging thing you’ve ever done whereas a 10 is a walk in the park. The goal of this framework is to highlight the lowest-hanging fruit.

Action step: Create a spreadsheet with 6 columns: Test ID (number), Hypothesis, Impact, Confidence, Ease, Composite.

This will allow you to connect to a wider database of test details on Test ID as the primary key. Assign ICE values to each hypothesis using whatever heuristic you decide and then sum these values in the Composite column. As a result, you now have a sortable list of your highest priority tests.
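
If you would rather prototype this before building the spreadsheet, the same table can be sketched in a few lines. The `Experiment` class is my own naming, the hypotheses are shortened versions of the potential solutions listed earlier, and the scores are invented purely to show the sorting.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """One row of the hypothetical ICE spreadsheet."""
    test_id: int
    hypothesis: str
    impact: int      # 1 (least impact) to 10 (maximum impact)
    confidence: int  # 1 (no confidence) to 10 (beyond a shadow of a doubt)
    ease: int        # 1 (hardest thing ever) to 10 (walk in the park)

    @property
    def composite(self) -> int:
        # The article sums the three scores into the Composite column.
        return self.impact + self.confidence + self.ease


def prioritize(experiments):
    """Sort experiments from highest to lowest composite score."""
    return sorted(experiments, key=lambda e: e.composite, reverse=True)


# Made-up scores for illustration only.
backlog = [
    Experiment(1, "Reduce unnecessary form fields", 7, 6, 8),
    Experiment(2, "Increase load speed of first meaningful paint", 6, 5, 3),
    Experiment(3, "Move the CTA up to the primary view port", 8, 6, 9),
    Experiment(4, "Change to a bolder headline font", 3, 3, 10),
]

for exp in prioritize(backlog):
    print(f"{exp.composite:>2}  #{exp.test_id}  {exp.hypothesis}")
```

Summing follows the approach described above; some teams average or multiply the scores instead, which changes how heavily a single low score drags a test down.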

Putting it all Together

The goal of this entire article is to give you the tools to build your own system for testing. I believe that every organization, whether you aren’t testing anything or have 50+ concurrent tests, can get better at this. The whole point of A/B testing is to find a hundred small wins that stack together to exponentially increase the growth of your business. While you may get some huge winners that radically change your acquisition, throughput, or margin, you’ll more than likely just find 100 things that can be optimized.

If you’re starting from zero here and want to know what to use, I would highly recommend using Hubspot, Optimizely or any other well-known marketing automation platform, as they have everything you need to run tests directly within the platform. No matter what your business is, I’m a huge proponent of their platform and have worked in it for most of my professional career across sales, marketing, automation, operations, etc. They have amazing support and strategy teams that are more than willing to help you get started.

I’ve included action steps for nearly every section of this post, so hopefully you’ve had at least one meaningful takeaway that you can implement within your organization today. If you have any questions or want to chat, please don’t hesitate to reach out directly to me at x@alexsteeno.com.
