User:Mikk/Scripts


The following is a collection of Mikk's scripts for managing various things around the wiki, mostly (but not only) relating to Interface Customization.


Nearly all of the scripts require GNU AWK 3.1. A standalone win32 "gawk.exe" is available in the unxutils package. As of the writing of this page, gawk 3.1 is in UnxUpdates.zip.
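
A quick way to check whether a given gawk build is new enough is to run a small smoke test that uses the same extensions as the scripts below. This is only a sketch: the file name gawkcheck.awk is made up for illustration, and a non-GNU awk will most likely fail with a syntax error instead of printing anything.

```awk
# Usage: gawk -f gawkcheck.awk      (hypothetical file name)
# Exercises the gawk extensions that the scripts on this page rely on.
BEGIN {
  ok = 1;

  # gensub() with a back-reference
  if (gensub(/(a+)/, "<\\1>", "g", "baab") != "b<aa>b") ok = 0;

  # match() with a third argument that receives the captured groups
  if (!match("[[API Foo]]", /\[\[(API [A-Za-z]+)/, m) || m[1] != "API Foo") ok = 0;

  # asort() returns the element count and renumbers the array from 1
  t["x"] = "b"; t["y"] = "a";
  if (asort(t) != 2 || t[1] != "a") ok = 0;

  # strftime() must exist; only check that it yields a 4-digit year
  if (strftime("%Y") !~ /^[0-9][0-9][0-9][0-9]$/) ok = 0;

  print (ok ? "gawk extensions OK" : "gawk extensions MISSING or broken");
}
```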

(I tend to use AWK simply because of this single exe download for Win32)


Converting scripts to wikitext

Heh, I needed a script to show my scripts =)

# Usage: gawk -f wikifycode.awk input.txt > output.txt

BEGIN {
  print "<div style=\"max-width: 80em; margin-right: 2em; height: 20em; overflow: scroll;\">";
}



 {
  gsub(/&/, "\\&amp;");
  gsub(/</, "\\&lt;");
  gsub(/>/, "\\&gt;");
  gsub(/{{/, "{\\&#x7b;");
  gsub(/''/, "'\\&#x27;");
  gsub(/\[\[/, "[\\&#x5b;");
  gsub(/http:/, "http\\&#x3a;");
  gsub("__", "_\\&#x5f;");
  
  gsub(/USERNAME *= *".*"/, "USERNAME = \"''YOURUSERNAME''\"");
  gsub(/PASSWORD *= *".*"/, "PASSWORD = \"''YOURPASSWORD''\"");
  
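  # Expand every leading tab to two spaces; a single anchored gensub() only converts the first one
  while (match($0, /^ *\t/))
    sub(/\t/, "  ");
  print " " $0;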
  
}


END {
  print "</div>";
}
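
To see what the escaping does, here is a minimal sketch that applies a few of the same substitutions to one sample string. The function name wikiescape and the sample line are assumptions made for illustration; they are not part of wikifycode.awk.

```awk
# Applies some of the entity escapes used by wikifycode.awk to a single string.
function wikiescape(s) {
  gsub(/&/, "\\&amp;", s);       # must run first, or the entities added below get re-escaped
  gsub(/</, "\\&lt;", s);
  gsub(/>/, "\\&gt;", s);
  gsub(/\[\[/, "[\\&#x5b;", s);  # break up "[[" so it is not parsed as a wiki link
  return s;
}

BEGIN {
  print wikiescape("if (a < b && x) print \"[[API Foo]]\"");
}
```

This should print: if (a &lt; b &amp;&amp; x) print "[&#x5b;API Foo]]"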


Create Global Function List

In-game scanner

Install as an addon, start WoW, copy-and-paste resulting text to "funcscan.txt".

Interface\Addons\!!!GlobFuncScan\!!!GlobFuncScan.toc

## Interface: 20300
## Title: Global Function Scanner
## Notes: Find all global WoW functions
## Author: Mikk
## SavedVariables: 
GlobFuncScan.lua

Interface\Addons\!!!GlobFuncScan\GlobFuncScan.lua

--[[
  Global Function Scanner addon by Mikk
  See http://www.wowwiki.com/User:Mikk/Scripts
  
  Up to date for WoW 2.4. Produces some extras, but those get filtered out by later scripts.
]]
if(not GlobFuncEdit) then
    GlobFuncEdit = CreateFrame("Editbox");
end
GlobFuncEdit:SetFontObject(GameFontHighlightSmall);
GlobFuncEdit:SetPoint("TOPRIGHT", UIParent, "TOPRIGHT", -10, -10);
GlobFuncEdit:SetPoint("TOPLEFT", UIParent, "TOPRIGHT", -250, -10);
GlobFuncEdit:SetHeight("500");
GlobFuncEdit:SetMultiLine(true);
GlobFuncEdit:SetScript("OnEscapePressed", function() this:Hide(); end);

local function funcaddr(func)
  return tonumber(strsub(tostring(func), 10), 16)
end

local refpoint={}
local function point(funcname)
  refpoint[funcname]=funcaddr(_G[funcname])
end

point("DeclineGroup")
point("FlagTutorial")
point("ConvertToRaid")
point("ShowLFG")
point("asin")
point("pairs")
point("AcceptQuest")

local res = {}
for k,v in pairs(_G) do
    if type(v)=="function"	--[[ and strfind(k, "^_*[A-Za-z0-9]+$") or ]] then
  	local addr = funcaddr(v)
  	for _,refaddr in pairs(refpoint) do
  		if abs(addr-refaddr)<300000 then
  			tinsert(res, k);
  			break
  		end
  	end
    end
end
table.sort(res);
for k,v in pairs(refpoint) do
  table.insert(res, 1, format("# %-15s %10u (0x%08x)", k, v, v));
end
table.insert(res, "# END")
local str = table.concat(res, "\n");
DEFAULT_CHAT_FRAME:AddMessage("GlobFuncScan: Found " .. #res .. " functions. Total output length is " .. strlen(str) .. " bytes.");
GlobFuncEdit:SetText(str);

GlobFuncEdit:Show();
GlobFuncEdit:SetFocus(true);
GlobFuncEdit:HighlightText(0, 999999);



List creator

Runs outside of WoW. You need a stand-alone Lua interpreter.

funcscan.lua

wowexe = "C:/program files/world of warcraft/wow.exe";
funcscan = "funcscan.txt";

skip={
  ["message"] = true,
  ["GetText"] = true,
}

forcedfuncs={
  ["SortLFG"]=true,	-- this doesn't get picked up by the exe scanner like it should; it shows up as "@ASortLFG". doh.
}


funcs={}

f = assert(io.open(funcscan, "rt"), "could not open '" .. funcscan .. "' for read");
for str in f:lines() do
  if not string.match(str, "^#") then
  	table.insert(funcs, str);
  end
end
f:close();
table.sort(funcs);



f = assert(io.open(wowexe, "rb"), "could not open '" .. wowexe .. "' for read");
wow = f:read("*a");
f:close();

wowstrings = {}
for str in string.gmatch(wow, "[A-Za-z0-9_][A-Za-z0-9_][A-Za-z0-9_]+") do
  wowstrings[str] = true;
  
end


for _,str in ipairs(funcs) do
  if (wowstrings[str] and not skip[str]) or forcedfuncs[str] then
  	print("* [[API "..str.."|"..str.."]]");
  else
  	io.stderr:write("skipping ",str,"\n")
  end
end



Run as lua funcscan.lua > globfunc.txt
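
The filtering rule in funcscan.lua is: keep a name if it occurs as a string in wow.exe and is not on the skip list, or if it is on the forced list. The gawk sketch below replays that rule on made-up sample data; MyAddonFunc and the contents of the arrays are hypothetical and only stand in for funcscan.txt and the exe strings.

```awk
BEGIN {
  # Hypothetical sample data standing in for funcscan.txt and the strings found in wow.exe
  nfuncs = split("AcceptQuest GetText MyAddonFunc SortLFG", funcs, " ");
  split("AcceptQuest GetText", tmp, " ");
  for (i in tmp) inexe[tmp[i]] = 1;

  skip["GetText"] = 1;     # present in the exe, but filtered out on purpose
  forced["SortLFG"] = 1;   # missed by the string scan, added by hand

  for (i = 1; i <= nfuncs; i++) {
    name = funcs[i];
    if (((name in inexe) && !(name in skip)) || (name in forced))
      print "* [[API " name "|" name "]]";
    else
      print "skipping " name > "/dev/stderr";
  }
}
```

With this sample data, only AcceptQuest and SortLFG end up on standard output; the other two names are reported on standard error.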


Manual work

  • Run the boldizer (below) on the result
  • Manually list new/removed functions in the Changes section at the top of the page (hint: Use the "Show Changes" button!)
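
Listing new and removed functions can also be done with a small helper if you kept the previous list. The following is a sketch, not one of the original scripts; listchanges.awk and globfunc-old.txt are hypothetical names.

```awk
# Usage: gawk -f listchanges.awk globfunc-old.txt globfunc.txt
# (hypothetical file names; the old list comes first, the new list second)

# Count files as they start: 1 = old list, 2 = new list
FNR == 1 { fileno++ }

match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  name = a[1];
  gsub("_", " ", name);          # treat "API_Foo" and "API Foo" as the same entry
  if (fileno == 1) old[name] = 1;
  else             cur[name] = 1;
}

END {
  for (name in cur) if (!(name in old)) print "Added:   " name;
  for (name in old) if (!(name in cur)) print "Removed: " name;
}
```

The order of "for (name in ...)" is unspecified in awk, so pipe the output through sort if you want it alphabetical.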


Boldizing Global Function List entries

Copy wikitext contents to files:

(The only important thing is that the global function list file has "glob" somewhere in its name, and that it is the last file in the list)

#
# Usage: gawk -f boldizeglobfuncs.awk luafuncs.txt wowapi.txt globfunc.txt > globfunc-fixed.txt
#
# Will correctly boldize Global Function List entries that do not occur in earlier files
# Assumes that the global functions list file has "glob" somewhere in its name (and others do not)
#

FILENAME!=LastFileName {
  LastFileName = FILENAME;
  IsGlobals = ( tolower(FILENAME) ~ /glob/ );
}

!IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  # Remember that we have seen this function
  gsub("_", " ", a[1]);
  api[a[1]]=1;
}

IsGlobals {
  # Fix dates occurring in the global function list
  if(/'' *Functions in bold are not .* as of .*''/) {
    gsub(/ as of .*''/, strftime(" as of %d %B %Y''"));
  }
  
  # Boldize (or not) depending on whether we have seen the API in use
  if(match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a)) {
    gsub("_", " ", a[1]);
    if(!api[a[1]])
      $0 = gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "'''[[\\2]]'''", 1);
    else
      $0 = gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "[[\\2]]", 1);
  }
  
  print;
}
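
The two gensub() calls are what make the script safe to re-run: the optional ''' groups swallow any existing bold markers before new ones are (or are not) added. A minimal sketch on one sample line, with the sample entry "API Foo" made up for illustration:

```awk
BEGIN {
  line = "* [[API Foo|Foo]]";

  # Boldize: any existing ''' around the link is consumed, then re-added once
  bold = gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "'''[[\\2]]'''", 1, line);
  print bold;

  # Un-boldize: applying the other variant to the bolded line restores the original
  print gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "[[\\2]]", 1, bold);
}
```

This prints the bolded line * '''[[API Foo|Foo]]''' followed by the plain line * [[API Foo|Foo]].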


Finding nonexistent APIs

This can be used to find nonexistent (mistyped or removed) APIs in e.g. World of Warcraft API, Widget API or Lua functions.

Copy wikitext contents to files:

(The only important thing is that the global function list file has "glob" somewhere in its name, and that it is the first file in the list)

#
# Usage: gawk -f findbadfuncs.awk globfunc.txt somefile.txt
#
# Finds functions that are NOT mentioned in the Global Function List
# Will also complain about:
# - FrameXML Object methods
# - FrameXML Variables 
# Just ignore those for now, there are not too many of them (yet)
#

FILENAME!=LastFileName {
  LastFileName = FILENAME;
  IsGlobals = ( tolower(FILENAME) ~ /glob/ );
}

IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  gsub("_", " ", a[1]);
  api[a[1]]=1;
}

!IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  gsub("_", " ", a[1]);
  if(!api[a[1]])
    print;
}
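
Both rules key the api[] array on a normalised link name, so that "API_GetTime" and "API GetTime" count as the same function. The sketch below isolates that step; the helper name apikey and the sample lines are assumptions for illustration.

```awk
# Extracts the normalised "API Name" key from a line of wikitext, or "" if there is none.
function apikey(line,   a) {
  if (!match(line, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a))
    return "";
  gsub("_", " ", a[1]);
  return a[1];
}

BEGIN {
  print apikey("* [[API_GetTime|GetTime]]() - returns the time");   # API GetTime
  print apikey(": [[API GetTime]]");                                 # API GetTime
  print "[" apikey("no link here") "]";                              # []
}
```

Note that the character class also accepts spaces, so a link written with a trailing space before the pipe would produce a key with a trailing space.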


Mismatching descriptions of duplicate API entries

This script finds duplicate API entries whose description or argument list differs. You can give it multiple files, and it will detect duplicates across files.

Copy wikitext contents to files:

  • Whatever page you want to test → somefile.txt
(World of Warcraft API, Widget API, Lua functions)

# Usage: gawk -f findduplicates.awk somefile.txt
#
# Will output duplicate API links where the text does NOT match,
# using a formatting that becomes clickable in most IDEs.
# (Visual Studio will probably want "filename.ext(123):" though. Change it below.)
#

$0!~/<!--/ && match($0, /^ *[:*] *\w* *\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  gsub("_", " ", a[1]);
  
  posstr=sprintf("%s:%4u: ", FILENAME, NR);
  # M$: posstr=sprintf("%s(%4u): ", FILENAME, NR);
    
  if(pos[a[1]]) {
    if(line[a[1]]==$0) { 
      equal++; 
      next; # Comment out this line if you want to see equal dups too
    }
    else
      nonequal++;
    
    print posstr $0 "\n" pos[a[1]] line[a[1]]
  } else  {
    pos[a[1]]=posstr;
    line[a[1]]=$0;
  }
}

END {
  print ""
  print "Total: "
  print "  " (equal+0) " equal duplicates."
  print "  " (nonequal+0) " nonequal duplicates."
}
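
To illustrate the kind of report this produces, here is a reduced sketch that runs the same comparison over three in-line sample lines instead of a file. The sample entries and the file name sample.txt are made up.

```awk
BEGIN {
  # Hypothetical sample lines standing in for a wiki page
  n = split(": [[API Foo]]() - does X\n: [[API Bar]]() - does Y\n: [[API_Foo]]() - does Z", lines, "\n");

  for (i = 1; i <= n; i++) {
    if (!match(lines[i], /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a))
      continue;
    key = a[1];
    gsub("_", " ", key);

    if ((key in first) && lines[first[key]] != lines[i])
      # Same API, different text: report the new line, then the first occurrence
      printf("sample.txt:%d: %s\nsample.txt:%d: %s\n", i, lines[i], first[key], lines[first[key]]);
    else if (!(key in first))
      first[key] = i;
  }
}
```

Here line 3 is reported against line 1, because both link to "API Foo" but the descriptions differ.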


Village pump summary to Project:Community portal

This will only work under Unix (Linux). It requires that curl is installed.

#!/usr/bin/gawk -f

# Goddamn, why did I get the idea to write this in Awk. It is SO not up to the task. Should have used php or perl instead. --Mikk

BEGIN {
  MAXUSERS = 4;
  MAXSUBJS = 5;
  USERNAME = "YOURUSERNAME";
  PASSWORD = "YOURPASSWORD";
}

# Grmbl. Imperial date crap. This would have been a 1-line string comparison if it was iso dates.

function monthnum(name) {
  if(name ~ /[Jj]an/) return 1;
  if(name ~ /[Ff]eb/) return 2;
  if(name ~ /[Mm]ar/) return 3;
  if(name ~ /[Aa]pr/) return 4;
  if(name ~ /[Mm]ay/) return 5;
  if(name ~ /[Jj]un/) return 6;
  if(name ~ /[Jj]ul/) return 7;
  if(name ~ /[Aa]ug/) return 8;
  if(name ~ /[Ss]ep/) return 9;
  if(name ~ /[Oo]ct/) return 10;
  if(name ~ /[Nn]ov/) return 11;
  if(name ~ /[Dd]ec/) return 12;
  return 0;
}


# Wants "12 May 2006", "9 Jun 2006"
function dateval(datestr,   t) {
  split(datestr, t, " "); 
  return (int(t[3])*10000)+(monthnum(t[2])*100)+int(t[1])
}




function storesubj(subj, users) { 
  if(subj=="") return;

  # Sort and count users
  asort(users);
  n=0;
  for(u in users)
    n++;

  # Extract last MAXUSERS as a single line
  a[1]=0;
  userstr = "";
  i=n+1-MAXUSERS; if(i<1) i=1;
  for( ; i<=n ; i++) {
    split(users[i], a, SUBSEP);
    ################# THIS IS THE FORMAT OF USER RECORDS ####################
    if(userstr!="") userstr=userstr " &middot; ";
    userstr = userstr a[2] " " a[3];
  }

  # Store
  subjs[subj] = a[1] SUBSEP subj SUBSEP userstr;
}



BEGIN {
  IGNORECASE=1;
  cmd="curl --max-time 180 --header 'Expect:' http://www.wowwiki.com/Special:Export/WoWWiki_talk:Village_pump";
  while((cmd | getline $0)>0) {
    if($0 ~ /<text /)
      cap=1;

    if(!cap)
      continue;
    if(/<\/text>/)
      cap=0;
      
    gsub("&lt;", "<");
    gsub("&gt;", ">");
    gsub("&quot;", "\"");
    gsub("&amp;", "\\&");

    if(match($0, /^=+ *([^=]+) *=+/, a)) {
      newsubj=a[1];
      storesubj(subj, users);
      subj=newsubj;
      delete users;
    }

    if(match($0, /(\[\[user:[^\]]+\]\]) .* ([0-9][0-9]? \w+ 20[0-9][0-9])/, a) && subj!="") {
      if(dateval(a[2]) > int(users[a[1]]))
        users[a[1]] = dateval(a[2]) SUBSEP a[1] SUBSEP a[2];
    }

  	if(!cap)
  		break;
  }
  close(cmd);

  storesubj(subj, users);


  # Now, extract the latest MAXSUBJS subjects and output
  asort(subjs);
  n=0;
  for(s in subjs)
    n++;

  out = "";
  i=n+1-MAXSUBJS; if(i<1) i=1;
  for( ; i<=n ; i++) {
    split(subjs[i], a, SUBSEP);
    ################## THIS IS THE FORMAT OF SUBJECT RECORDS ################
    out = out "* <b>" a[2] "</b>\n";
    out = out ":" a[3] "\n";
    out = out "\n";
  }

  # Final touches to the output

  print "<!-- This is automatically generated. Editing it might be pointless. -->" > "tmp";
  print out > "tmp";
  close("tmp");

  # Login

  print "\n\n\n----- LOGGING IN -----\n";
  cmd="curl --max-time 180 --header 'Expect:' --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --cookie-jar cookiejar | grep -A 15 '<body'";
  print "" | cmd;
  close(cmd);


  # Grab hold of the wpEditToken
  print "\n\n\n----- GETTING wpEditToken -----\n";
  cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki_talk:Village_pump/Summary&action=edit' --cookie cookiejar";
  while((cmd | getline $0)>0) {
    if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a))
      wpEditToken = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a))
      wpEdittime = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a))
      wpStarttime = a[1];
  }
  close(cmd);

  if(wpEditToken=="") {
    print "ERROR: wowwiki wouldn't give me its wpEditToken!";
    exit(1);
  }

  # Post the page
  print "\n\n\n-----  POSTING CHANGES -----\n";
  cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki_talk:Village_pump/Summary&action=submit' --cookie cookiejar --form 'wpTextbox1=<tmp' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime="wpEdittime"' --form 'wpStarttime="wpStarttime"' --form 'wpEditToken="wpEditToken"'";
  print cmd;
  print "" | cmd;
  close(cmd);

  # Force the Community portal to refresh
  print "\n\n\n----- PURGING COMMUNITY PORTAL -----\n";
  cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki:Community_portal&action=purge' | grep -A 50 'Recent talk in the'";
  print "" | cmd;
  close(cmd);

}
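
The date handling turns "12 May 2006" into the number 20060512, so that later dates compare as larger numbers. A worked sketch follows; the compact monthnum() here is an assumed variation that only looks at the first three letters, not the if-chain used in the script above.

```awk
# Assumed compact variant: look up the 3-letter abbreviation in a packed table.
function monthnum(name,   p) {
  if (length(name) < 3) return 0;
  p = index("janfebmaraprmayjunjulaugsepoctnovdec", tolower(substr(name, 1, 3)));
  return (p % 3 == 1) ? (p + 2) / 3 : 0;   # only aligned hits are real months
}

# Wants "12 May 2006", "9 Jun 2006"
function dateval(datestr,   t) {
  split(datestr, t, " ");
  return int(t[3]) * 10000 + monthnum(t[2]) * 100 + int(t[1]);
}

BEGIN {
  print dateval("12 May 2006");                            # 20060512
  print dateval("9 Jun 2006");                             # 20060609
  print (dateval("9 Jun 2006") > dateval("12 May 2006"));  # 1
}
```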


Events/A et al to Events/Action Bar et al and Events/Names

Automatically generates events-by-category pages and the simple name list from the alphabetically indexed pages (the originals). It is run a couple of times daily.

This will only work under Unix (Linux). It requires that curl is installed.

#!/usr/bin/gawk -f

BEGIN {
  USERNAME = "YOURUSERNAME";
  PASSWORD = "YOURPASSWORD";
  
  KnownCat["Action Bar"] = 1;
  KnownCat["Auction"] = 1;
  KnownCat["Bank"] = 1;
  KnownCat["Battleground"] = 1;
  KnownCat["Buff"] = 1;
  KnownCat["Combat"] = 1;
  KnownCat["Communication"] = 1;
  KnownCat["Death"] = 1;
  KnownCat["GlueXML"] = 1;
  KnownCat["Guild"] = 1;
  KnownCat["Honor"] = 1;
  KnownCat["Instance"] = 1;
  KnownCat["Item"] = 1;
  KnownCat["Loot"] = 1;
  KnownCat["Mail"] = 1;
  KnownCat["Map"] = 1;
  KnownCat["Misc"] = 1;
  KnownCat["Movement"] = 1;
  KnownCat["Party"] = 1;
  KnownCat["Pet"] = 1;
  KnownCat["Player"] = 1;
  KnownCat["Quest"] = 1;
  KnownCat["Skill"] = 1;
  KnownCat["Spell"] = 1;
  KnownCat["System"] = 1;
  KnownCat["Tooltip"] = 1;
  KnownCat["Trade"] = 1;
  KnownCat["Tradeskill"] = 1;
  KnownCat["Trainer"] = 1;
  KnownCat["Unit Info"] = 1;
  
  curl="curl --max-time 180 --silent --show-error --cookie-jar cookiejar --cookie cookiejar --header 'Expect:' ";

  RS = "\r?\n";
}


#
# TitleEncode() - replace a few strategic characters that are likely to show up and cause problems. not a full url encoder.
#

function TitleEncode(page) {
  gsub(" ", "_", page);
  gsub("&", "%26", page);
  gsub(/\?/, "%3f", page);
  gsub(/\(/, "%28", page);
  gsub(/\)/, "%29", page);
  return page;
}

#
# GetPage() - get the full contents of a page via Special:Export
#

function GetPage(page,    cmd,cap,ret) {
  page = TitleEncode(page);
  ret=""
  cmd=curl "'http://www.wowwiki.com/Special:Export/" page "'";
  print cmd;
  while((cmd | getline $0)>0) {
    if($0 ~ /<text /) {
      gsub(/.*<text [^>]*>/, "");
      cap=1;
    }

    if(!cap)
      continue;

    if(/<\/text>/) {
      gsub(/<\/text>.*/, "");
      ret = ret $0;
      break;
    }
    
    ret = ret $0 "\n";
  }
  close(cmd);
  gsub("&gt;", ">", ret);
  gsub("&lt;", "<", ret);
  gsub("&quot;", "\"", ret);
  gsub("&amp;", "\\&", ret);
  return ret;
}


#
# Login() - log in. Will exit(1) on failure.
#

function Login() {
  headers = "headers.tmp";
  print "" > headers;
  close(headers);
  cmd=curl " --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --dump-header " headers " > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
  
  while((getline < headers)>0) {
    if(/^Set-Cookie: wowwikiUserID=[0-9]/) {
    	close(headers)
    	return;	# success!
    }
  }
  print "ERROR: Login failure!";
  exit(1);
}


#
# PutPage() - push a new revision of the given page
#

function PutPage(page, content) {
  page = TitleEncode(page);
  # Grab hold of the wpEditToken
  cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=edit'";
  print cmd;
  while((cmd | getline $0)>0) {
    if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a))
      wpEditToken = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a))
      wpEdittime = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a))
      wpStarttime = a[1];
  }
  close(cmd);	
  
  if(wpEditToken=="") {
    print "ERROR: wowwiki wouldn't give me its wpEditToken!";
    exit(1);
  }

  # Post the page
  tmpfile = "post.tmp";
  print content > tmpfile;
  close(tmpfile);
  cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=submit' --form 'wpTextbox1=<"tmpfile"' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime="wpEdittime"' --form 'wpStarttime="wpStarttime"' --form 'wpEditToken="wpEditToken"'";
  print cmd;
  print "" | cmd;
  close(cmd);
}


#
# PurgePage() - do "action=purge" on a page to pull in changes in templates
#

function PurgePage(page) {
  page = TitleEncode(page);
  cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=purge' > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
}





BEGIN {
  
  Login();
  
  for(i=0x41;i<=0x5a;i++) {
  	c=sprintf("%c",i);
  	if(1) {
  		txt = GetPage("Events/" c);
  	}
  	else {
  		txt="";
  		while((getline < c)>0)
  			txt = txt $0 "\n";
  		close(c);
  	}
  	
  	# Trim out the inserted dummy headers that are only there to get [Edit] links at regular intervals
  	gsub(/[ \n]*\|}[ \n]*===+ +===+[ \n]*{\|[ \n]*/, "", txt);
  	
  	# Split on "{{evt" to get one event per array entry
  	n = split(txt, a, /{{evt/);
  	
  	AllEvents = AllEvents "\n== " c " ==\n"
  	
  	for(ei=1;ei<=n;ei++) {
  		if(match(a[ei], /\|([A-Z_]+)\|([a-zA-Z_, ]+)}}( *\n)+/, parms)) {
  			name = parms[1]
  			header = substr(a[ei], 1, RSTART+RLENGTH);
  			# print "Event: '" name "'  Categories:'" parms[2] "'";
  			
  			gsub(/^ */, "", parms[2]);
  			gsub(/ *$/, "", parms[2]);
  			split(parms[2], categories, / *, */);
  			# for(ci in categories) print "C: '" categories[ci] "'"
  			
  			txt = substr(a[ei], RSTART+RLENGTH);
  			gsub(/( *\n)*$/, "", txt);
  			# print txt;
  			
  			AllEvents = AllEvents "\n* " name " &nbsp; &rarr; [[Events/" c "|" c "]] <small>"
  			
  			for(ci in categories) {
  				cat = categories[ci];
  				if(!KnownCat[cat])
  					print "Warning: '" name "': Unknown category: '" cat "'";
  				else {
  					CatPage[cat] = CatPage[cat] "{{evt|" name "|" parms[2] "}}\n\n" txt "\n\n\n"
  					AllEvents = AllEvents "&middot; [[Events/" cat "|" cat "]] ";
  				}
  			}
  			
  			AllEvents = AllEvents "</small>\n"
  			
  		}
  	} # END for(ei=1;ei<=n;ei++)
  	
  }
  
  if(1) for(cat in CatPage) {
  	PutPage("Events/" cat, 
  		"__NOTOC____NOEDITSECTION__{{eventlistheader}}\n" \
  		"<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
  		"<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
  		"<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
  		"\n\n" \
  		":{{icon-information}}Note that this page is automatically generated; editing it is pointless. To edit event descriptions, edit the entries in the alphabetical pages, e.g. [[Events/A]], [[Events/B]], etc. Changes there will be copied over to here within a few hours.\n" \
  		"\n\n" \
  		"== " cat " related events ==\n\n" \
  		CatPage[cat] );
  }
  
  PutPage("Events/Names",
  	"__NOEDITSECTION__{{tocright}}[[Category:API Events| ]]\n" \
  	"<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
  	AllEvents	);
  
}
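
The core of the script is the match() that pulls the event name and category list out of each chunk left over after splitting on "{{evt". A minimal sketch of that step on one sample chunk; the description text is made up for illustration.

```awk
BEGIN {
  # One chunk as it looks after splitting the page text on "{{evt"
  entry = "|PLAYER_DEAD|Death, Player}}\nSample description text.";

  if (match(entry, /\|([A-Z_]+)\|([a-zA-Z_, ]+)}}( *\n)+/, parms)) {
    n = split(parms[2], categories, / *, */);
    printf("%s has %d categories:", parms[1], n);
    for (ci = 1; ci <= n; ci++)
      printf(" [%s]", categories[ci]);
    print "";

    # Everything after the header (and its trailing newlines) is the description
    print "description: " substr(entry, RSTART + RLENGTH);
  }
}
```

This should report PLAYER_DEAD with the categories [Death] and [Player], followed by the description line.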


RC bot

Takes Special:RecentChanges and reformats it into WoWWiki:RC, with edits by users listed on WoWWiki:RC/Skip removed. It also analyzes the diffs to try to find vandalism and abusive text.

This will only work under Unix (Linux). It requires that curl is installed.

#!/usr/bin/gawk -f

BEGIN {
  USERNAME = "YOURUSERNAME";
  PASSWORD = "YOURPASSWORD";
  
  curl="curl --max-time 180 --silent --show-error --cookie-jar cookiejar --cookie cookiejar --header 'Expect:' ";

  RS = "\r?\n";
}


#
# TitleEncode() - replace a few strategic characters that are likely to show up and cause problems. not a full url encoder.
#

function TitleEncode(page) {
  gsub(" ", "_", page);
  gsub("&", "%26", page);
  gsub(/\?/, "%3f", page);
  gsub(/\(/, "%28", page);
  gsub(/\)/, "%29", page);
  return page;
}

#
# GetPage() - get the full contents of a page via Special:Export
#

function GetPage(page,    cmd,cap,ret) {
  page = TitleEncode(page);
  ret=""
  cmd=curl "'http://www.wowwiki.com/Special:Export/" page "'";
  print cmd;
  while((cmd | getline $0)>0) {
    if($0 ~ /<text /) {
      gsub(/.*<text [^>]*>/, "");
      cap=1;
    }

    if(!cap)
      continue;

    if(/<\/text>/) {
      gsub(/<\/text>.*/, "");
      ret = ret $0;
      break;
    }
    
    ret = ret $0 "\n";
  }
  close(cmd);
  gsub("&gt;", ">", ret);
  gsub("&lt;", "<", ret);
  gsub("&quot;", "\"", ret);
  gsub("&amp;", "\\&", ret);
  return ret;
}


#
# Login() - log in. Will exit(1) on failure.
#

function Login() {
  headers = "headers.tmp";
  print "" > headers;
  close(headers);
  cmd=curl " --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --dump-header " headers " > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
  
  while((getline < headers)>0) {
    if(/^Set-Cookie: wowwikiUserID=[0-9]/) {
    	close(headers)
    	return;	# success!
    }
  }
  print "ERROR: Login failure!";
  exit(1);
}


#
# PutPage() - push a new revision of the given page
#

function PutPage(page, content) {
  page = TitleEncode(page);
  # Grab hold of the wpEditToken
  cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=edit'";
  print cmd;
  while((cmd | getline $0)>0) {
    if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a))
      wpEditToken = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a))
      wpEdittime = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a))
      wpStarttime = a[1];
  }
  close(cmd);	
  
  if(wpEditToken=="") {
    print "ERROR: wowwiki wouldn't give me its wpEditToken!";
    exit(1);
  }

  # Post the page
  tmpfile = "post.tmp";
  print content > tmpfile;
  close(tmpfile);
  cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=submit' --form 'wpTextbox1=<"tmpfile"' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime="wpEdittime"' --form 'wpStarttime="wpStarttime"' --form 'wpEditToken="wpEditToken"'";
  print cmd;
  print "" | cmd;
  close(cmd);
}


#
# PurgePage() - do "action=purge" on a page to pull in changes in templates
#

function PurgePage(page) {
  page = TitleEncode(page);
  cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=purge' > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
}



#
# GetRCSummary() - retrieve a short summary of changes
#

function GetRCSummary(rcid, diffurl) {
  	
  if(!RCs[rcid]) {
  		
  	dellines=0;
  	lol=0;
  	lolol=0;
  	url=0;
  	tinyurl=0;
  	gold=0;
  	powerlevel=0;
  	ninja=0;
  	pwn=0;
  	noob=0;
  	insult=0;
  	fuck=0;
  	penis=0;
  	pussy=0;
  	ass=0;
  	blows=0;
  	gay=0;
  	
  	gsub(/&amp;/, "\\&", diffurl);
  	gsub(/'/, "%27", diffurl);
  	gsub(/"/, "%22", diffurl);
  	gsub(/\{/, "%7b", diffurl);
  	gsub(/\|/, "%7c", diffurl);
  	gsub(/\}/, "%7d", diffurl);
  	
  	rccmd=curl " '" diffurl "'";
  	# print rccmd;
  	bOK=0
  	while((rccmd | getline )>0) {
  		bOK=1
  		if(/<td class='diff-deletedline'>/ && /<td class='diff-addedline'><\/td>/)
  			dellines++;
  		if(match($0, /<td class='diff-addedline'>(.*)<\/td>/, a)) {
  			$0=a[1];
  			# print "ADDED: \"" $0 "\""
  			oldIC = IGNORECASE;
  			IGNORECASE = 1;
  			if(/\yl+o+l+\y/)  lol++;
  			if(/lo+l+o+l/)  lolol++;
  			if(/la+w+l/)  lolol++;
  			if(/fuck/) fuck++;
  			if(/tinyurl\.com/)  tinyurl++;
  			if(/http:\/\// && (! /thottbot.com/) && (! /allakhazam.com/) && (! /worldofwarcraft.com/) )  url++;
  			if(/gold/)  gold++;
  			if(/power.?level/)  powerlevel++;
  			if(/ninja/)  ninja++;
  			if(/pwn/) pwn++;
  			if(/n[o0][o0]+b/ || /\ynub/)  noob++;
  			if(/idiot/ || /moron/ || /bastard/ || /[a@][s$]+[- ]?h[o0]le/ || /schmuck/ || /\yween(ie|s)/)  insult++;
  			if(/f[ue]ck/ || /f[*]ck/ || /f00k/) fuck++;
  			if(/\ydick/ || /p[e3]+n[i1l]+[s$]/ || /\yc[o0]+ck/) penis++;
  			if(/pu[s$][s$]y/ || /pu[s$][s$][i1l][e3][s$]/ || /cu*nt/ || /c[*]nt/) pussy++;
  			if(/\y[a@]ss\y/ || /\y[a@]rse\y/ || /\y[a@]sses\y/ || /[a@][s5][s5]h[a@]t/ || /[a@][s5][s5]c[l1][o0]wn/ || /[a@][s5][s5].?h[o0][l1i][e3]/) ass++;
  			if(/blows/) blows++;
  			if(/ga+y/ || /gh[3e]y/ || /g[3e]hy/) gay++;
  			IGNORECASE = oldIC;
  		}
  	}
  	close(rccmd);

  	if(bOK) {
  		txt=" ";
  		if(dellines>5)
  			txt=txt" &middot; " dellines " deleted lines";
  		if(lol)
  			txt=txt" &middot; \"lol\" x " lol;
  		if(lolol)
  			txt=txt" &middot; \"lolol\" etc x " lolol;
  		if(url)
  			txt=txt" &middot; \"http://\" x " url;
  		if(tinyurl)
  			txt=txt" &middot; \"tinyurl.com\" x " tinyurl;
  		if(gold)
  			txt=txt" &middot; \"gold\" x " gold;
  		if(powerlevel)
  			txt=txt" &middot; \"power level\" x " powerlevel;
  		if(ninja)
  			txt=txt" &middot; \"ninja\" x " ninja;
  		if(pwn)
  			txt=txt" &middot; \"pwn\" x " pwn;
  		if(noob)
  			txt=txt" &middot; \"noob\" x " noob;
  		if(insult)
  			txt=txt" &middot; \"idiot\" etc x " insult;
  		if(fuck)
  			txt=txt" &middot; \"fuck\" etc x " fuck;
  		if(penis)
  			txt=txt" &middot; \"penis\" etc x " penis;
  		if(pussy)
  			txt=txt" &middot; \"pussy\" etc x " pussy;
  		if(ass)
  			txt=txt" &middot; \"ass\" etc x " ass;
  		if(blows)
  			txt=txt" &middot; \"blows\" x " blows;
  		if(gay)
  			txt=txt" &middot; \"gay\" etc x " gay;
  			
  
  		RCs[rcid] = txt;
  		print rcid "\t'" txt "'";
  		print rcid "\t" txt >> "rcs.txt";
  		fflush("rcs.txt");
  	}
  	else {
  		return "(error getting diff)";
  	}
  }
  

  if(RCs[rcid]==" ")
  	return "";
  return RCs[rcid];
}





BEGIN {
  
  # Read summaries of previously scanned recent changes
  rcfile="rcs.txt";
  while((getline < rcfile)>0) {
  	split($0, a, "\t");
  	RCs[a[1]] = a[2];
  }
  close(rcfile);
  

  # Log in :-)
  Login();
  
  
  # Get list of users to skip
  txt = GetPage("WoWWiki:RC/Skip");
  n = split(txt, a, /\n/);
  for(i=1;i<=n;i++)	{
  	name = a[i];
  	gsub(/^\* +/, "", name);
  	gsub(/ *$/, "", name);
  	aSkip[name]=1;
  }

  
  res = "<noinclude><!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->{{nocat}}</noinclude>\n\n";


  ######## Retrieve and parse the recent changes page
  
  rc_tmp = "rc.tmp";
  # Retrieve to a temp file or the operation times out while we're scanning changed pages
  system(curl " 'http://www.wowwiki.com/index.php?title=Special:Recentchanges&limit=1500&hidepatrolled=1&days=3' > " rc_tmp);
  
  while((getline < rc_tmp)>0) {
  	time=0;
  	
  	
  	if(match($0, /<h4>(.*)<\/h4>/, a))
  		res = res "\n\n=== " a[1] " ===\n";
  		
  	if(match($0, /(<\/tt>|<strong>)<a href="\/([^"]*)"[^>]*>([^<]*)<\/a>/, a)) {
  		pagelink=a[2];
  		pagetitle=a[3];
  		subitem=0;
  	}
  	else if(match($0, /<tt><a href="\/([^"]*)"[^>]*>([^<]*)<\/a>/, a)) {
  		pagelink=a[1];
  		pagetitle=a[2];
  		time=" ";
  		subitem=1;
  	}
  	else
  		continue;
  		
  	if(subitem && bSkip)
  		continue;
  	
  	pageurl="";
  	if(pagelink ~ /^index.php/) {
  		pageurl="http://www.wowwiki.com/" pagelink
  		page = "[" pageurl " " pagetitle "]";
  	}
  	else {
  		gsub(/^[Cc]ategory:/, ":Category:", pagelink);
  		gsub(/^[Ii]mage:/, ":Image:", pagelink);
  		page = "[[" pagelink "|" pagetitle "]]";
  	}
  	
  	if(match($0, /<a href="([^"]*)"[^>]*>diff<\/a>/, a))
  		diffurl="http://www.wowwiki.com" a[1];
  	else if(match($0, /<a href="([^"]*)"[^>]*>last<\/a>/, a))
  		diffurl="http://www.wowwiki.com" a[1];
  	else if(match($0, /<a href="([^"]*)"[^>]*>changes<\/a>/, a))
  		diffurl="http://www.wowwiki.com" a[1];
  	else
  		diffurl="";
  	
  	if(match($0, /<a href="([^"]*)"[^>]*>hist<\/a>/, a))
  		histurl="http://www.wowwiki.com" a[1];
  	else if(match($0, /<a href="([^"]*)"[^>]*>Page history<\/a>/, a))
  		histurl="http://www.wowwiki.com" a[1];
  	else
  		histurl="";

  	if(match($0, /<a href="([^"]*)"[^>]*>cur<\/a>/, a))
  		curdiffurl="http://www.wowwiki.com" a[1];
  	else
  		curdiffurl="";

  	if(!time) {
  		if(!match($0, / ([0-9][0-9]:[0-9][0-9]:[0-9][0-9]) /,a))
  			continue;
  		time="<tt>" a[1] "</tt>"
  	}
  	
  	user=""
  	usertxt="";
  	if(match($0, /\. \. <a [^>]*title="User:([^"]*)"/, a)) {
  		user=a[1];
  		usertxt="[[User:" user "|" user "]]";
  	}
  	else if(match($0, /<span class="changedby">\[(.*)\]<\/span>/, a)) {
  		usertxt="[ <small>" gensub(/<a [^>]*title="([^"]*)"[^>]*>([^<]*)<\/a>/, "[[\\1|\\2]]", "g", a[1]) "</small> ]";
  	}
  	
  	# See if we only have skippable users in this bundle (which may not be a bundle; it can be a single entry, but that doesn't matter)
  	if(!subitem) {
  		bSkip = 1;
  		split(usertxt, a, /\[\[/);
  		for(i in a) {
  			if(match(a[i], /User:(.*)\|/, aa)) {
  				if(!aSkip[aa[1]]) {
  					bSkip = 0;
  				}
  			}
  		}
  	}	

  	if(bSkip)
  		continue;
  	
  	if(user)	
  		print "User: " user "  Title: " pagetitle;
  	else
  		print "----------- Title: " pagetitle;
  	
  	if(match($0, /<span class='comment'>\((.*)\)<\/span>/, a)) {
  		comment=a[1];
  		gsub(/<a [^>]*>/, "", comment);
  		gsub(/<\/a>/, "", comment);
  		gsub(/{/, "\\&#x7b;", comment);
  		gsub(/\[\[Category:/, "[[:Category:", comment);
  		comment="&nbsp; (" comment ")"
  	}
  	else
  		comment="";

  	# Get recent change summary (cached or new). This mucks up $0 so keep it last.
  	rcurl = diffurl;
  	if(pageurl ~ /rcid=/)
  		rcurl = pageurl;
  	if(match(rcurl, "rcid=([0-9]+)", a)) {
  		rctxt = GetRCSummary(a[1], rcurl);
  		if(rctxt!="")
  			rctxt = "&nbsp; &nbsp; &nbsp; &nbsp; <b style=\"border-bottom: 1px dotted;\">" rctxt "</b>";
  	}
  	else
  	{
  		rctxt = "";
  	}
  	
  	if(subitem)
  	{
  		res = res ":::* " page " . . <small>[" diffurl " diff] . [" curdiffurl " cur]</small> . . " usertxt " <small>" comment "</small>" rctxt "\n"
  	}
  	else
  		res = res "* " time " " page " . . <small>[" diffurl " diff] . [" histurl " hist]</small> . . " usertxt " <small>" comment "</small>" rctxt "\n"
  		
  	fflush();
  	if(length(res)>=256000) {
  		res = res "\n''RC output length exceeds 256 KB. Ignoring older entries. To see older entries, mark some of the above changes as patrolled and older ones will start appearing on the next update.''\n";
  		break;
  	}
  }
  close(rc_tmp);
  
  
  PutPage("WoWWiki:RC/Content", res);
  
  PurgePage("WoWWiki:RC");
}
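
The diff analysis boils down to case-insensitive pattern counting, using gawk's \y word-boundary operator where a pattern must not match inside a longer word. A reduced sketch with three of the patterns and made-up sample lines (example.com is a placeholder):

```awk
BEGIN {
  IGNORECASE = 1;

  # Hypothetical "added lines" from a diff, separated by |
  n = split("LOL this page|Buy cheap gold at http://example.com|A trolololo song|Fixed a typo", added, "|");

  for (i = 1; i <= n; i++) {
    flags = "";
    if (added[i] ~ /\yl+o+l+\y/) flags = flags " lol";   # whole word only
    if (added[i] ~ /gold/)       flags = flags " gold";
    if (added[i] ~ /http:\/\//)  flags = flags " url";
    print i ":" (flags == "" ? " clean" : flags);
  }
}
```

Line 1 is flagged "lol", line 2 "gold url", and lines 3 and 4 come out clean; "trolololo" is not flagged by this particular pattern because of the word boundaries.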


Forumizing wikitext

Rich formatting

# Usage: gawk -f forumize.awk wikitext.txt > forum.txt
#
# Will attempt to convert wikitext to text suitable for pasting in a forum
# Mostly tested with http://www.wowwiki.com/UI_FAQ to http://wowinterface.com
#

BEGIN {
  RS="\r?\n";
  sectidx=0;
  BIGBIG = "[SIZE=4]"
  _BIGBIG = "[/SIZE]"
  
  BIG = "[SIZE=3]"
  _BIG = "[/SIZE]"
  
  TT = "[FONT=Courier New][COLOR=LightBlue]"
  _TT = "[/COLOR][/FONT]"
}


/^ *$/ && skipblanklines { next }
skipblanklines { skipblanklines = 0 }



match($0, /^(=+) *([^=]+) *=+(.*)/, a) {
  if(a[1]=="=")
  {
    sectidx = sectidx + 1;
    outidx = outidx "[b]" sectidx ". " a[2] "[/b]\n";
    $0 = "" BIGBIG "[u][b]" sectidx ". " a[2] "[/b][/u]" _BIGBIG a[3];
  }
  else if(a[1]=="==")
  {
    outidx = outidx "... - " a[2] "\n";
    $0 = "" BIG "[u][b]" a[2] "[/b][/u]" _BIG a[3];
  }
  else
    $0 = "[u][b]" a[2] "[/b][/u]" a[3];
  
}



/'''/ {
  while(/'''/) {
    sub(/'''/, "[b]");
    if(!sub(/'''/, "[/b]"))
      $0 = $0 "[/b]";
  }
}

/''/ {
  while(/''/) {
    sub(/''/, "[i]");
    if(!sub(/''/, "[/i]"))
      $0 = $0 "[/i]";
  }
}


/^{{faqq}}/ {
  sub(/^{{faqq}} */, "[b]" BIG "[COLOR=Orange]Q:[/COLOR]" _BIG " ");
  $0 = $0 "[/b]";
}

/^{{faqa}}/ {
  sub(/^{{faqa}} */, "[b]" BIG "A:" _BIG "[/b] ");
}


/^:/ {
  sub(/^::/, "        ");
  sub(/^:/, "    ");
}

/^;.*:/ {
  $0=gensub(/^;([^:]+):(.*)/, "[b]\\1[/b]\n\\2", "1", $0);
}
/^;/ {
  $0=gensub(/^;([^:]+)/, "[b]\\1[/b]", "1", $0);
}

/^[^#]/ {
  count=0;
}

/^#/ {
  count=count+1;
  sub(/^# */, count ". ");
}


/\[http:\/\// {
  $0 = gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "[URL=\\1]\\2[/URL]", "g", $0);
}


/{{/ {
  gsub(/{{[Ee]xample\/Begin}}/, "[quote]");
  gsub(/{{[Ee]xample\/End}}/, "[/quote]");
  
  $0 = gensub(/{{[Ff]aqcredit[|]([^}]+)}}/, "[i](credit: \\1)[/i]", "g", $0);
}


/\[\[#/ {
  $0 = gensub(/\[\[#[^|]+\|([^\]]+)\]\]/, "[i]\\1[/i]", "g", $0);
}


/<div/ { gsub("[\r\n]+$", "\n", out); gsub(/<div [^>]*>/, "[INDENT]"); sub("[\r\n]+", "\n"); skipblanklines=1; }
/<\/div>/ { gsub("[\r\n]+$", "\n", out); gsub(/<\/div>/, "[/INDENT]"); sub("[\r\n]+", "\n"); skipblanklines=1; }

/<code>/ || /<tt>/ { gsub(/(<code>|<tt>)/, TT); }
/<\/code>/ || /<\/tt>/ { gsub(/(<\/code>|<\/tt>)/, _TT); }



 {
  out = out "\n" $0
  next;
}

END {
  print outidx;
  print "";
  print out;
}
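
The bold and italic rules depend on order: ''' has to be converted before '', otherwise every bold marker would be consumed as an italic marker plus a stray apostrophe. A minimal sketch of the two loops as a function; the name bbcode and the sample strings are assumptions for illustration.

```awk
function bbcode(s) {
  # Bold first: each pair of ''' becomes [b]...[/b]; an unpaired one is closed at end of line
  while (s ~ /'''/) {
    sub(/'''/, "[b]", s);
    if (!sub(/'''/, "[/b]", s)) s = s "[/b]";
  }
  # Then italics, with the same pairing logic
  while (s ~ /''/) {
    sub(/''/, "[i]", s);
    if (!sub(/''/, "[/i]", s)) s = s "[/i]";
  }
  return s;
}

BEGIN {
  print bbcode("'''Note:''' use ''gawk'' here");   # [b]Note:[/b] use [i]gawk[/i] here
  print bbcode("an ''unclosed italic");            # an [i]unclosed italic[/i]
}
```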


Basic formatting (e.g. blizzard forums)

# Usage: gawk -f forumize-basic.awk wikitext.txt > forum.txt
#
# Will attempt to convert wikitext to text suitable for pasting in a forum
# with VERY basic markup, e.g. the blizzard WoW forums
#

BEGIN {
  RS="\r?\n";
  sectidx=0;
  BIGBIG = ""
  _BIGBIG = ""
  
  BIG = ""
  _BIG = ""
  
  TT = "[u]"
  _TT = "[/u]"
}


/^ *$/ && skipblanklines { next }
skipblanklines { skipblanklines = 0 }



match($0, /^(=+) *([^=]+) *=+(.*)/, a) {
  if(a[1]=="=")
  {
    sectidx = sectidx + 1;
    outidx = outidx "[b]" sectidx ". " a[2] "[/b]\n";
    $0 = "" BIGBIG "[u][b]" sectidx ". " a[2] "[/b][/u]" _BIGBIG a[3];
  }
  else if(a[1]=="==")
  {
    outidx = outidx "... - " a[2] "\n";
    $0 = "" BIG "[u][b]" a[2] "[/b][/u]" _BIG a[3];
  }
  else
    $0 = "[u][b]" a[2] "[/b][/u]" a[3];
  
}



/'''/ {
  while(/'''/) {
    sub(/'''/, "[b]");
    if(!sub(/'''/, "[/b]"))
      $0 = $0 "[/b]";
  }
}

/''/ {
  while(/''/) {
    sub(/''/, "[i]");
    if(!sub(/''/, "[/i]"))
      $0 = $0 "[/i]";
  }
}


/^{{faqq}}/ {
  sub(/^{{faqq}} */, "[b]Q: ");
  $0 = $0 "[/b]";
}

/^{{faqa}}/ {
  sub(/^{{faqa}} */, "[b]A: [/b]");
}


/^:/ {
  sub(/^::/, "        ");
  sub(/^:/, "    ");
}

/^;.*:/ {
  $0=gensub(/^;([^:]+):(.*)/, "[b]\\1[/b]\n\\2", "1", $0);
}
/^;/ {
  $0=gensub(/^;([^:]+)/, "[b]\\1[/b]", "1", $0);
}

/^[^#]/ {
  count=0;
}

/^#/ {
  count=count+1;
  sub(/^# */, count ". ");
}


/\[http:\/\// {
  $0 = gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "\\2 ( \\1 )", "g", $0);
}


/{{/ {
  gsub(/{{[Ee]xample\/Begin}}/, "[quote]");
  gsub(/{{[Ee]xample\/End}}/, "[/quote]");
  
  $0 = gensub(/{{[Ff]aqcredit[|]([^}]+)}}/, "[i](credit: \\1)[/i]", "g", $0);
}


/\[\[#/ {
  $0 = gensub(/\[\[#[^|]+\|([^\]]+)\]\]/, "[i]\\1[/i]", "g", $0);
}


/<div/ { gsub("[\r\n]+$", "\n", out); gsub(/<div [^>]*>/, ""); sub("[\r\n]+", "\n"); $0 = "[ul]\n[li]" $0; skipblanklines=0; }
/<\/div>/ { gsub("[\r\n]+$", "\n", out); gsub(/<\/div>/, "[/ul]"); sub("[\r\n]+", "\n"); skipblanklines=1; }

/<code>/ || /<tt>/ { gsub(/(<code>|<tt>)/, TT); }
/<\/code>/ || /<\/tt>/ { gsub(/(<\/code>|<\/tt>)/, _TT); }



 {
  out = out "\n" $0
  next;
}

END {
  print outidx;
  print "";
  print out;
}
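
The main visible difference between the two forumizers is how external links are rendered. A small sketch that runs both gensub() variants on the same sample sentence (the sentence itself is made up):

```awk
BEGIN {
  s = "See [http://www.wowwiki.com/UI_FAQ the UI FAQ] for details.";

  # Rich formatting: BBCode URL tag
  print gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "[URL=\\1]\\2[/URL]", "g", s);

  # Basic formatting: link text followed by the bare URL in parentheses
  print gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "\\2 ( \\1 )", "g", s);
}
```

The first line comes out as "See [URL=http://www.wowwiki.com/UI_FAQ]the UI FAQ[/URL] for details." and the second as "See the UI FAQ ( http://www.wowwiki.com/UI_FAQ ) for details."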