How to extract data from Mailchimp API

How to extract data from Mailchimp API

Our favorite email marketing monkey. Mailchimp is the heart of our marketing campaigns. But it generates data and sometimes these data are important to be together with data from services like billing or social. How can you get them and then load them to a data warehouse like Amazon Redshift for further analysis?

About Mailchimp

Mailchimp helps businesses observe their subscribers’ activities, send automated emails to them based on their behavior and preferences, optimize and target the appropriate audience for each campaign using specific tools, and monitor sales and website activity with revenue reports. Companies can add content and collaborate on campaigns that fit their brand using MailChimp’s Email Designer; edit campaigns, collaborate with their teams using MailChimp Editor; send one-to-one messages using Mandrill; collect signups from their tablet using Chimpadeedoo; and access all the services via mobiles using MailChimp Mobile.

About Mailchimp API

[update 18/2] Mailchimp updated its API to v3, for more information visit MailChimp API v3.0 documentation.

mailchimp api

MailChimp was always a promoter of APIs and encouraged integration with other systems. It has a rich API that exposes a large number of endpoints for interacting with the resources of the applications.

There are endpoints for the following resources:

  • Automations – offers functionality related to automated tasks that we define on MailChimp. For example it llets you build a series of triggered emails that are sent to subscribers over a set period of time.
  • Batch operations – for managing batch processes on our MailChimp account
  • Campaign folders – helps you organise your campaigns into folders
  • Campaigns – for managing your campaigns
  • Conversations – helps you view subscribers’ replies to your campaigns in your MailChimp account
  • File manager files – is a place to store images, documents, and other files you include or link to in your campaigns, templates, or signup forms.
  • File manager folders – for creating folders to organise your assets
  • ListsCRUD operations on lists of users
  • Reports – for accessing reports with statistics on your campaigns. All Reports endpoints are read-only. MailChimp’s campaign and Automation reports analyze clicks, opens, subscribers’ social activity, e-commerce data, and more.
  • Template folders – operations on creating folders for organising your templates for your emails
  • Templates – operations on templates for your emails (A template is an HTML file used to create the layout and basic design for a campaign).

Where are the users?

The MailChimp API doesn’t have a root level resource for Users or Subscribers. Users are not considered a stand alone resource but instead they exist only inside the lists.

Note: It is important to note here that every service perceives the world from a different perspective, relevant to the value it offers. So even if what we care about is information about our users, it makes completely sense for a service like MailChimp to organise everything around lists or campaigns. If we check the model of another service, like Intercom, we’ll notice that the user is at the top resources, which again makes sense because Intercom is all about one to one communication with users.

mailchimp api

Interact with Mailchimp API

Interacting with the MailChimp REST API can be done by using tools like CURL or Postman or by using http clients for your favourite language or framework. A few suggestions:

The MailChimp REST API supports oAuth 2.0 authentication, more information can be found in the Authorised apps section of the API documentation. After you successfully authenticate with the REST API, you have to start interacting with its resources and start fetching data from it in order to load them on your data warehouse.

Extract your user data from the MailChimp API

MailChimp and any other service that you might be using, has figured out (hopefully) the optimal model for its operations. But when we fetch data from them we usually want to answer questions or do things that are not part of the context that these services operate.

For this reason we should always keep in mind that when we work with data coming from external services we need to re-model it and bring it to the right form for our needs.

Let’s assume that we want to get all the information we got on MailChimp for our users in order to enrich our user records inside our data warehouse. To do that we need to do the following.

First we need to fetch all the lists that we have created on MailChimp, we can do this by performing a GET request to the appropriate endpoint.

curl --request GET \
--url 'https://usX.api.mailchimp.com/3.0/lists' \
--user 'anystring:apikey' \
—include

We should get back a response like the following:

{
  "lists": [
    {
      "id": "57afe96172",
      "name": "Freddie's Jokes",
      "contact": {
        "company": "MailChimp",
        "address1": "675 Ponce De Leon Ave NE",
        "address2": "Suite 5000",
        "city": "Atlanta",
        "state": "GA",
        "zip": "30308",
        "country": "US",
        "phone": ""
      },……

Using the ID we get from the response for each list, we iterate through all the lists and make requests to the appropriate end-points to get the members for list

curl --request GET \
--url 'https://usX.api.mailchimp.com/3.0/lists/57afe96172/members' \
--user 'anystring:apikey' \
—include

and we should get a response back like the following:

{
  "members": [
    {
      "id": "f777bbffab8d1ceca8b757df63c47cb8",
      "email_address": "urist.mcvankab+1@freddiesjokes.co",
      "unique_email_id": "882e9bec19",
      "email_type": "html",
      "status": "subscribed",
      "status_if_new": "",
      "merge_fields": {
        "FNAME": "",
        "LNAME": ""
      },
      "interests": {
        "9143cf3bd1": true,
        "3a2a927344": false,
        "f9c8f5f0ff": false,
        "f231b09abc": true,
        "bd6e66465f": false
      },…………

Keep in mind that a user might appear in more than one lists, which means that you also need to reduplicate your results based on this fact.

Schema & Load

After you extract all the information you need, you have to map it to the schema of your data warehouse repository and then load the data to it.

The design of the schema and loading of the data is of most importance as it can both affect the performance of your data warehouse cluster and the questions that you can answer. And of course this procedure differs in the case of Amazon Redshift or Azure SQL Data Warehouse or Google BigQuery.

Takeaways: Which option will work for you?

The way to proceed relies on the data you want to load and the requirements of your use case. Things can get even more complicated if you want to integrate data coming from different sources. You can:

  1. Write, host and maintain a flexible data infrastructure.
  2. Use a product like Blendo that can handle this kind of problems automatically for you. With Blendo you can quickly and safely move all your data from Mailchimp into data warehouse solutions like Amazon Redshift and start generating insights from your data.
Advertisements

9 thoughts on “How to extract data from Mailchimp API

  1. But when I try to extract data, like the list of members after following the above steps, the list does not contain all the ids, that are actually present. There are close to 200 members, but the json only contains the information about 10 members only!! Please help for the same.
    Thank you

  2. Hey Joshua, you need to paginate your results. Mailchimp has a default page size of 10, that’s why you get only 10 results back with your request. As an example:

    curl –request GET \
    –url ‘https://usX.api.mailchimp.com/3.0/lists/57afe96172/members?offset=0&count=200’ \
    –user ‘anystring:apikey’ \
    —include

    the above request will get you back at most 200 results (in case you have less than 200). You will notice that offset is set to 0 and the page size to 200. Alternatively you can make multiple requests using a smaller count but moving the offset to a new start to receive the next page.

    • Thank you so much for your help ! If I encounter any more problems in the future then I will definitely ask on this blog! It’s absolutely great! Thanks !!

    • It worked! thank you very much for replying. I just wanted to ask you one more thing, is there a way for me to get(through API) the list of all the people who opened a particular campaign and the list of all the people who clicked on a particular campaign etc ?? Because we send about 5 campaigns everyday and we want to know which people are clicking or opening what type of campaign. Could you please help me for the same? Thank you.

  3. It worked! thank you very much for replying. I just wanted to ask you one more thing, is there a way for me to get(through API) the list of all the people who opened a particular campaign and the list of all the people who clicked on a particular campaign etc ?? Because we send about 5 campaigns everyday and we want to know which people are clicking or opening what type of campaign. Could you please help me for the same? Thank you.

  4. I cannot extract email-activity of more than 5000 people for a particular campaign using Mailchimp API v3 through R or cURL. Is there a limit for the same or is there a better way to extract data of more than 5000 people for a campaign through API ?
    Thank you.

    • Hey Joshua, thanks for your comments first of all. Kostas is my friend and glad he helped earlier. Where do you want to send your MailChimp data to? You can try http://www.blendo.co (Kostas and me are co-founders). You will get your MailChimp (and other data) in the place you need in no time, with no coding / maintenance hassle for your pipeline. Let me know if you want me to arrange an invitation for you.

  5. this is my batch operation – (“campaign_id” is the id of the particular campaign with 200000 people in it) –

    {
    “operations”: [ { “method”: “GET”, “path”: “/reports/”campaign_id”/email-activity”, “operation_id” : “3” } ]
    }

    for 200,000 people it takes 4.5 hours ! So is there a faster way to GET email-activity and how do GET only 3 fields from email-activity through Batch operations ? Thank you.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s