Tuesday, January 31, 2017

Migrate a WordPress blog into HubSpot

There is a tutorial on how to migrate WordPress blog posts into HubSpot. If something more complicated is needed, e.g. handling language variants, the commands below help to automate it. I also offer the migration as a service.

The original blog had language variants: English posts lived at http://blog.company.com/2017/... (no language in the path), while the other languages used a language prefix, e.g. http://blog.company.com/ru/2017/... for Russian.

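Therefore, the English blog items are processed separately, by deleting from the export every item whose link does not have the year directly after the host name: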

{ xmlstarlet ed -d '//item[not(contains(link,"blog.avast.com/20"))]' all.xml; echo ""; } > all-en.xml

The other languages are processed in a loop, which also strips the language prefix from each link:

for L in cs de es fr it ja pl pt-br ru tr uk; do
  F="all-$L.xml"
  # keep only the items whose link contains this language prefix
  { xmlstarlet ed -d '//item[not(contains(link,"blog.avast.com/'"$L"'/"))]' all.xml; echo ""; } > "$F"
  # strip the language prefix from the links
  sed -r -i "s:(<link.*blog.avast.com/)$L/:\1:g" "$F"
done
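
After the split it can be worth checking that no item was lost. The following is only a sketch: the helper names count_items and check_split are made up for this post, and it assumes that every opening <item> tag sits on its own line, as it usually does in a WordPress export.

```shell
# Count <item> elements in an export file.
# Assumption: each opening <item> tag is on its own line.
count_items() {
    grep -c '<item>' "$1"
}

# Print the item count of the source file next to the summed
# item count of the split files given as remaining arguments.
check_split() {
    local source_file=$1
    shift
    local total=0 f
    for f in "$@"; do
        total=$(( total + $(count_items "$f") ))
    done
    echo "source: $(count_items "$source_file"), split files: $total"
}
```

It could be called as check_split all.xml all-en.xml all-ru.xml (listing every language file); if the two numbers differ, some items matched none of the link patterns.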

Redirects then need to be set up in the HubSpot settings so that the old URLs remain accessible and point to the new locations of the articles. This helps to maintain the ranking in search engines and does not break links from other sites.

# English (no loop needed); reads "old URL,new URL" lines from standard input
perl -ne 'if(/^http:\/\/blog\.avast\.com(.*),http:\/\/avast\.hs-sites\.com(.*)$/ && $1 ne $2) { print "$1,$2\n"}' > redirects-languages.txt


# other languages; each perl call again reads "old URL,new URL" lines from standard input
for L in de it fr ru pl pt-br cs es; do
  perl -ne 'if(/^http:\/\/blog\.avast\.com(.*),http:\/\/avast\.hs-sites\.com(.*)$/ && "/'$L'$1" ne $2) { print "/'$L'$1,$2\n"}' >> redirects-languages.txt
done


The exported posts may also contain CDATA markers inside script tags. They can be removed in Vim with these substitutions:

:%s#\(<script.*\)// <!\[CDATA\[#\1#
:%s#// \]\]\]\]><!\[CDATA\[></script>#</script>#
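
When there are many files, the same two substitutions can probably be applied without opening an editor. This is a sketch of a sed translation of the Vim patterns above; the function name strip_script_cdata is made up, and it assumes a sed that supports -E (GNU and BSD sed both do).

```shell
# Remove the CDATA markers around inline <script> bodies.
# Reads text on stdin and writes the cleaned text to stdout.
strip_script_cdata() {
    sed -E \
        -e 's#(<script.*)// <!\[CDATA\[#\1#' \
        -e 's#// \]\]\]\]><!\[CDATA\[></script>#</script>#'
}
```

It could be used as strip_script_cdata < all-en.xml > all-en-clean.xml, where the output file name is only an example.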

Before uploading the new redirects (which can be done in bulk), we deleted the old ones. The deletion can be automated with an iMacros script:

VERSION BUILD=8940826 RECORDER=FX
TAB T=1
URL GOTO=https://app.hubspot.com/content/486579/settings/url-mappings
TAG POS=1 TYPE=SPAN ATTR=CLASS:dropdown-targetsettings-icon&&TXT:
TAG POS=11 TYPE=A ATTR=TXT:Delete
TAG POS=1 TYPE=A ATTR=ID:hs-fancybox-ok

With these commands, it was possible to migrate thousands of blog posts from WordPress into HubSpot.