User:Mikk/Scripts
The following is a collection of Mikk's scripts for managing various things around the wiki, mostly (but not only) relating to Interface Customization.
Nearly all of the scripts require GNU AWK 3.1. A standalone Win32 "gawk.exe" is available in the UnxUtils package. At the time of writing, gawk 3.1 is in UnxUpdates.zip.
(I tend to use AWK simply because it is a single-exe download for Win32.)
Converting scripts to wikitext
Heh, I needed a script to show my scripts =)
# Usage: gawk -f wikifycode.awk input.txt > output.txt

BEGIN {
    print "<div style=\"max-width: 80em; margin-right: 2em; height: 20em; overflow: scroll;\">";
}

{
    # "\\&" in a replacement string is a literal "&"
    gsub(/&/, "\\&amp;");
    gsub(/</, "\\&lt;");
    gsub(/>/, "\\&gt;");
    gsub(/{{/, "{\\&#123;");
    gsub(/''/, "'\\&#39;");
    gsub(/\[\[/, "[\\&#91;");
    gsub(/http:/, "http\\&#58;");
    gsub("__", "_\\&#95;");
    gsub(/USERNAME *= *".*"/, "USERNAME = \"''YOURUSERNAME''\"");
    gsub(/PASSWORD *= *".*"/, "PASSWORD = \"''YOURPASSWORD''\"");
    print " " gensub(/^( *)(\t)/, "\\1 ", "g", $0);
}

END {
    print "</div>";
}
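The script above leans on one gawk rule that is easy to trip over: in a gsub() replacement string a bare "&" means "the matched text", while "\\&" is a literal ampersand. A minimal sketch of that idea follows; wikiescape() is a hypothetical helper written for illustration, not part of wikifycode.awk.

```awk
# Sketch only: wikiescape() is a hypothetical helper, not part of wikifycode.awk.
function wikiescape(s) {
    # "&" must be escaped first, otherwise the "&" of "&lt;" and "&gt;"
    # would be escaped a second time.
    gsub(/&/, "\\&amp;", s);
    gsub(/</, "\\&lt;", s);
    gsub(/>/, "\\&gt;", s);
    return s;
}

BEGIN {
    print wikiescape("if (a < b && c > d)");
    # prints: if (a &lt; b &amp;&amp; c &gt; d)
}
```

Run it with "gawk -f" like the other scripts; it reads no input.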
Create Global Function List
In-game scanner
Install as an addon, start WoW, then copy and paste the resulting text into "funcscan.txt".
Interface\Addons\!!!GlobFuncScan\!!!GlobFuncScan.toc
## Interface: 20300
## Title: Global Function Scanner
## Notes: Find all global WoW functions
## Author: Mikk
## SavedVariables:
GlobFuncScan.lua
Interface\Addons\!!!GlobFuncScan\GlobFuncScan.lua
--[[
Global Function Scanner addon by Mikk
See http://www.wowwiki.com/User:Mikk/Scripts

Up to date for WoW 2.4. Produces some extras but that gets filtered out by later scripts.
]]

if(not GlobFuncEdit) then
    GlobFuncEdit = CreateFrame("Editbox");
end
GlobFuncEdit:SetFontObject(GameFontHighlightSmall);
GlobFuncEdit:SetPoint("TOPRIGHT", UIParent, "TOPRIGHT", -10, -10);
GlobFuncEdit:SetPoint("TOPLEFT", UIParent, "TOPRIGHT", -250, -10);
GlobFuncEdit:SetHeight("500");
GlobFuncEdit:SetMultiLine(true);
GlobFuncEdit:SetScript("OnEscapePressed", function() this:Hide(); end);

local function funcaddr(func)
    return tonumber(strsub(tostring(func), 10), 16)
end

local refpoint={}
local function point(funcname)
    refpoint[funcname]=funcaddr(_G[funcname])
end

point("DeclineGroup")
point("FlagTutorial")
point("ConvertToRaid")
point("FlagTutorial")
point("ShowLFG")
point("asin")
point("pairs")
point("AcceptQuest")

local res = {}
for k,v in pairs(_G) do
    if type(v)=="function" --[[ and strfind(k, "^_*[A-Za-z0-9]+$") or ]] then
        local addr = funcaddr(v)
        for _,refaddr in pairs(refpoint) do
            if abs(addr-refaddr)<300000 then
                tinsert(res, k);
                break
            end
        end
    end
end

table.sort(res);
for k,v in pairs(refpoint) do
    table.insert(res, 1, format("# %-15s %10u (0x%08x)", k, v, v));
end
table.insert(res, "# END")

local str = table.concat(res, "\n");
DEFAULT_CHAT_FRAME:AddMessage("GlobFuncScan: Found " .. #res .. " functions. Total output length is " .. strlen(str) .. " bytes.");
GlobFuncEdit:SetText(str);
GlobFuncEdit:Show();
GlobFuncEdit:SetFocus(true);
GlobFuncEdit:HighlightText(0, 999999);
List creator
Runs outside of WoW. You need a stand-alone Lua interpreter.
funcscan.lua
wowexe = "C:/program files/world of warcraft/wow.exe";
funcscan = "funcscan.txt";

skip={
    ["message"] = true,
    ["GetText"] = true,
}

forcedfuncs={
    ["SortLFG"]=true,  -- this doesnt get picked up by the exe scanner like it should, it says "@ASortLFG". doh.
}

funcs={}
f = assert(io.open(funcscan, "rt"), "could not open '" .. funcscan .. "' for read");
for str in f:lines() do
    if not string.match(str, "^#") then
        table.insert(funcs, str);
    end
end
f:close();
table.sort(funcs);

f = assert(io.open(wowexe, "rb"), "could not open '" .. wowexe .. "' for read");
wow = f:read("*a");

wowstrings = {}
for str in string.gmatch(wow, "[A-Za-z0-9_][A-Za-z0-9_][A-Za-z0-9_]+") do
    wowstrings[str] = true;
end

for _,str in ipairs(funcs) do
    if (wowstrings[str] and not skip[str]) or forcedfuncs[str] then
        print("* [[API "..str.."|"..str.."]]");
    else
        io.stderr:write("skipping ",str,"\n")
    end
end
Run as lua funcscan.lua > globfunc.txt
Manual work
- Run the boldizer (below) on the result
- Paste into Global Function List
- Manually list new/removed functions in the Changes section at the top of the page (hint: Use the "Show Changes" button!)
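The new/removed list for the Changes section can also be produced mechanically. The following is a minimal sketch, assuming the previous revision of the list was saved to a second file; the script name listchanges.awk and the file names are hypothetical. It reuses the link-matching pattern from the other scripts on this page.

```awk
# Usage (hypothetical names): gawk -f listchanges.awk globfunc-old.txt globfunc-new.txt
# Sketch only: prints functions that were added to or removed from the second file.

# Return the normalized page name from a "* [[API Foo|Foo]]" line, or "" if none.
function apiname(line,    m) {
    if (match(line, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, m)) {
        gsub("_", " ", m[1]);
        return m[1];
    }
    return "";
}

{
    name = apiname($0);
    if (name == "")
        next;
    if (FILENAME == ARGV[1])
        old[name] = 1;     # first file: previous revision
    else
        cur[name] = 1;     # second file: current revision
}

END {
    # for-in order is unspecified, so pipe the output through sort if needed
    for (name in cur) if (!(name in old)) print "added:   " name;
    for (name in old) if (!(name in cur)) print "removed: " name;
}
```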
Boldizing Global Function List entries
Copy wikitext contents to files:
- Lua functions → luafuncs.txt
- World of Warcraft API → wowapi.txt
- Global Function List → globfunc.txt
(The only important thing is that the global function list file has "glob" somewhere in its name, and that it is the last file in the list)
#
# Usage: gawk -f boldizeglobfuncs.awk luafuncs.txt wowapi.txt globfunc.txt > globfunc-fixed.txt
#
# Will correctly boldize Global Function List entries that do not occur in earlier files
# Assumes that the global functions list file has "glob" somewhere in its name (and others do not)
#

FILENAME!=LastFileName {
    LastFileName = FILENAME;
    IsGlobals = ( tolower(FILENAME) ~ /glob/ );
}

!IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
    # Remember that we have seen this function
    gsub("_", " ", a[1]);
    api[a[1]]=1;
}

IsGlobals {
    # Fix dates occurring in the global function list
    if(/'' *Functions in bold are not .* as of .*''/) {
        gsub(/ as of .*''/, strftime(" as of %d %B %Y''"));
    }

    # Boldize (or not) depending on whether we have seen the API in use
    if(match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a)) {
        gsub("_", " ", a[1]);
        if(!api[a[1]])
            $0 = gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "'''[[\\2]]'''", 1);
        else
            $0 = gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "[[\\2]]", 1);
    }
    print;
}
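The core of the boldizer is a single gensub() that strips any existing bold markers and then re-adds them (or not). A small sketch of just that step, using the same regular expression; setbold() is a hypothetical wrapper for illustration and does not exist in the script.

```awk
# Sketch only: setbold() is a hypothetical wrapper around the script's gensub() call.
function setbold(line, bold) {
    # Group 1 and 3 swallow optional existing ''' markers, group 2 is the link body.
    return gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/,
                  bold ? "'''[[\\2]]'''" : "[[\\2]]", 1, line);
}

BEGIN {
    print setbold("* [[API Foo|Foo]]", 1);        # prints: * '''[[API Foo|Foo]]'''
    print setbold("* '''[[API Foo|Foo]]'''", 0);  # prints: * [[API Foo|Foo]]
    print setbold("* '''[[API Foo|Foo]]'''", 1);  # already bold: unchanged
}
```

Because the optional groups absorb the old markers, running the boldizer twice on the same file does not stack up extra quotes.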
Finding nonexistent APIs
This can be used to find nonexistent (typoed or removed) APIs in e.g. World of Warcraft API, Widget API or Lua functions.
Copy wikitext contents to files:
- Global Function List → globfunc.txt
- Whatever page you want to test → somefile.txt
(The only important thing is that the global function list file has "glob" somewhere in its name, and that it is the first file in the list)
#
# Usage: gawk -f findbadfuncs.awk globfunc.txt somefile.txt
#
# Finds functions that are NOT mentioned in the Global Function List
# Will also complain about:
#   - FrameXML Object methods
#   - FrameXML Variables
# Just ignore those for now, there are not too many of them (yet)
#

FILENAME!=LastFileName {
    LastFileName = FILENAME;
    IsGlobals = ( tolower(FILENAME) ~ /glob/ );
}

IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
    gsub("_", " ", a[1]);
    api[a[1]]=1;
}

!IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
    gsub("_", " ", a[1]);
    if(!api[a[1]])
        print;
}
Mismatching descriptions of duplicate API entries
This script finds duplicate API entries whose description or argument list differs. You can give it multiple files and it will detect duplicates across them.
Copy wikitext contents to files:
- Whatever page you want to test → somefile.txt
# Usage: gawk -f findduplicates.awk somefile.txt
#
# Will output duplicate API links where the text does NOT match,
# using a formatting that becomes clickable in most IDEs.
# (Visual Studio will probably want "filename.ext(123):" though. Change it below.)
#

$0!~/<!--/ && match($0, /^ *[:*] *\w* *\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
    gsub("_", " ", a[1]);
    # FNR (not NR) so that line numbers restart for each input file
    posstr=sprintf("%s:%4u: ", FILENAME, FNR);
    # M$: posstr=sprintf("%s(%4u): ", FILENAME, FNR);
    if(pos[a[1]]) {
        if(line[a[1]]==$0) {
            equal++;
            next;  # Comment out this line if you want to see equal dups too
        }
        else
            nonequal++;
        print posstr $0 "\n" pos[a[1]] line[a[1]]
    }
    else {
        pos[a[1]]=posstr;
        line[a[1]]=$0;
    }
}

END {
    print ""
    print "Total: "
    print "  " (equal+0) " equal duplicates."
    print "  " (nonequal+0) " nonequal duplicates."
}
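Stripped of the position bookkeeping, the duplicate check is a "first seen wins" table: the first line for an API is stored, and every later line for the same API is compared against it. A minimal sketch of that logic; remember() and the seen array are hypothetical names, not taken from the script.

```awk
# Sketch only: remember() is a hypothetical helper, not part of findduplicates.awk.
# Returns "new", "equal" or "differs" for a (key, text) pair.
function remember(key, text) {
    if (!(key in seen)) {
        seen[key] = text;      # the first occurrence becomes the reference text
        return "new";
    }
    return (seen[key] == text) ? "equal" : "differs";
}

BEGIN {
    print remember("API UnitName", ": [[API UnitName|UnitName]](unit)");      # new
    print remember("API UnitName", ": [[API UnitName|UnitName]](unit)");      # equal
    print remember("API UnitName", ": [[API UnitName|UnitName]](\"unit\")");  # differs
}
```

Note that a later "differs" line is always compared to the first occurrence, never to another later one, which matches what the script above does.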
Village pump summary to Project:Community portal
This will only work under Unix (Linux). It requires that curl is installed.
#!/usr/bin/gawk -f

# Goddamn, why did I get the idea to write this in Awk. It is SO not up to the task. Should have used php or perl instead. --Mikk

BEGIN {
    MAXUSERS = 4;
    MAXSUBJS = 5;
    USERNAME = "YOURUSERNAME";
    PASSWORD = "YOURPASSWORD";
}

# Grmbl. Imperial date crap. This would have been a 1-line string comparison if it was iso dates.
function monthnum(name) {
    if(name ~ /[Jj]an/) return 1;
    if(name ~ /[Ff]eb/) return 2;
    if(name ~ /[Mm]ar/) return 3;
    if(name ~ /[Aa]pr/) return 4;
    if(name ~ /[Mm]ay/) return 5;
    if(name ~ /[Jj]un/) return 6;
    if(name ~ /[Jj]ul/) return 7;
    if(name ~ /[Aa]ug/) return 8;
    if(name ~ /[Ss]ep/) return 9;
    if(name ~ /[Oo]ct/) return 10;
    if(name ~ /[Nn]ov/) return 11;
    if(name ~ /[Dd]ec/) return 12;
    return 0;
}

# Wants "12 May 2006", "9 Jun 2006"
function dateval(datestr, t) {
    split(datestr, t, " ");
    return (int(t[3])*10000)+(monthnum(t[2])*100)+int(t[1])
}

function storesubj(subj, users) {
    if(subj=="")
        return;

    # Sort and count users
    asort(users);
    n=0;
    for(u in users)
        n++;

    # Extract last MAXUSERS as a single line
    a[1]=0;
    userstr = "";
    i=n+1-MAXUSERS;
    if(i<1)
        i=1;
    for( ; i<=n ; i++) {
        split(users[i], a, SUBSEP);
        ################# THIS IS THE FORMAT OF USER RECORDS ####################
        if(userstr!="")
            userstr=userstr " · ";
        userstr = userstr a[2] " " a[3];
    }

    # Store
    subjs[subj] = a[1] SUBSEP subj SUBSEP userstr;
}

BEGIN {
    IGNORECASE=1;
    cmd="curl --max-time 180 --header 'Expect:' http://www.wowwiki.com/Special:Export/WoWWiki_talk:Village_pump";
    while((cmd | getline $0)>0) {
        if($0 ~ /<text /)
            cap=1;
        if(!cap)
            continue;
        if(/<\/text>/)
            cap=0;

        # Decode the XML entities used by Special:Export
        gsub("&lt;", "<");
        gsub("&gt;", ">");
        gsub("&quot;", "\"");
        gsub("&amp;", "\\&");

        if(match($0, /^=+ *([^=]+) *=+/, a)) {
            newsubj=a[1];
            storesubj(subj, users);
            subj=newsubj;
            delete users;
        }
        if(match($0, /(\[\[user:[^\]]+\]\]) .* ([0-9][0-9]? \w+ 20[0-9][0-9])/, a) && subj!="") {
            if(dateval(a[2]) > int(users[a[1]]))
                users[a[1]] = dateval(a[2]) SUBSEP a[1] SUBSEP a[2];
        }
        if(!cap)
            break;
    }
    close(cmd);
    storesubj(subj, users);

    # Now, extract the latest MAXSUBJS subjects and output
    asort(subjs);
    n=0;
    for(s in subjs)
        n++;
    out = "";
    i=n+1-MAXSUBJS;
    if(i<1)
        i=1;
    for( ; i<=n ; i++) {
        split(subjs[i], a, SUBSEP);
        ################## THIS IS THE FORMAT OF SUBJECT RECORDS ################
        out = out "* <b>" a[2] "</b>\n";
        out = out ":" a[3] "\n";
        out = out "\n";
    }

    # Final touches to the output
    print "<!-- This is automatically generated. Editing it might be pointless. -->" > "tmp";
    print out > "tmp";
    close("tmp");

    # Login
    print "\n\n\n----- LOGGING IN -----\n";
    cmd="curl --max-time 180 --header 'Expect:' --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --cookie-jar cookiejar | grep -A 15 '<body'";
    print "" | cmd;
    close(cmd);

    # Grab hold of the wpEditToken
    print "\n\n\n----- GETTING wpEditToken -----\n";
    cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki_talk:Village_pump/Summary&action=edit' --cookie cookiejar";
    while((cmd | getline $0)>0) {
        if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a))
            wpEditToken = a[1];
        if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a))
            wpEdittime = a[1];
        if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a))
            wpStarttime = a[1];
    }
    close(cmd);
    if(wpEditToken=="") {
        print "ERROR: wowwiki wouldn't give me its wpEditToken!";
        exit(1);
    }

    # Post the page
    print "\n\n\n----- POSTING CHANGES -----\n";
    cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki_talk:Village_pump/Summary&action=submit' --cookie cookiejar --form 'wpTextbox1=<tmp' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime=" wpEdittime "' --form 'wpStarttime=" wpStarttime "' --form 'wpEditToken=" wpEditToken "'";
    print cmd;
    print "" | cmd;
    close(cmd);

    # Force the Community portal to refresh
    print "\n\n\n----- PURGING COMMUNITY PORTAL -----\n";
    cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki:Community_portal&action=purge' | grep -A 50 'Recent talk in the'";
    print "" | cmd;
    close(cmd);
}
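The trick that makes the sorting work is dateval(): it packs a "day month year" signature date into a single YYYYMMDD number, so that plain numeric comparison (and asort()) orders entries chronologically. A self-contained sketch of the same idea; the table-driven monthnum() here is a variant written for this example, not the regex chain the script uses.

```awk
# Sketch only: monthnum() below is a table-driven variant, not the script's own version.
function monthnum(name,    i, months) {
    split("jan feb mar apr may jun jul aug sep oct nov dec", months, " ");
    name = tolower(substr(name, 1, 3));
    for (i = 1; i <= 12; i++)
        if (months[i] == name)
            return i;
    return 0;                      # unknown month name
}

# Wants "12 May 2006", "9 Jun 2006"; returns e.g. 20060512
function dateval(datestr,    t) {
    split(datestr, t, " ");
    return int(t[3]) * 10000 + monthnum(t[2]) * 100 + int(t[1]);
}

BEGIN {
    print dateval("12 May 2006");  # prints: 20060512
    print dateval("9 Jun 2006");   # prints: 20060609
    print ((dateval("9 Jun 2006") > dateval("12 May 2006")) ? "later" : "earlier");  # prints: later
}
```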
Events/A et al to Events/Action Bar et al and Events/Names
Automatically generates events-by-category pages and the simple name list from the alphabetically indexed pages (the originals). It is run a couple of times daily.
This will only work under Unix (Linux). It requires that curl is installed.
#!/usr/bin/gawk -f

BEGIN {
    USERNAME = "YOURUSERNAME";
    PASSWORD = "YOURPASSWORD";

    KnownCat["Action Bar"] = 1;
    KnownCat["Auction"] = 1;
    KnownCat["Bank"] = 1;
    KnownCat["Battleground"] = 1;
    KnownCat["Buff"] = 1;
    KnownCat["Combat"] = 1;
    KnownCat["Communication"] = 1;
    KnownCat["Death"] = 1;
    KnownCat["GlueXML"] = 1;
    KnownCat["Guild"] = 1;
    KnownCat["Honor"] = 1;
    KnownCat["Instance"] = 1;
    KnownCat["Item"] = 1;
    KnownCat["Loot"] = 1;
    KnownCat["Mail"] = 1;
    KnownCat["Map"] = 1;
    KnownCat["Misc"] = 1;
    KnownCat["Movement"] = 1;
    KnownCat["Party"] = 1;
    KnownCat["Pet"] = 1;
    KnownCat["Player"] = 1;
    KnownCat["Quest"] = 1;
    KnownCat["Skill"] = 1;
    KnownCat["Spell"] = 1;
    KnownCat["System"] = 1;
    KnownCat["Tooltip"] = 1;
    KnownCat["Trade"] = 1;
    KnownCat["Tradeskill"] = 1;
    KnownCat["Trainer"] = 1;
    KnownCat["Unit Info"] = 1;

    curl="curl --max-time 180 --silent --show-error --cookie-jar cookiejar --cookie cookiejar --header 'Expect:' ";
    RS = "\r?\n";
}

#
# TitleEncode() - replace a few strategic characters that are likely to show up and cause problems. not a full url encoder.
#
function TitleEncode(page) {
    gsub(" ", "_", page);
    gsub("&", "%26", page);
    gsub(/\?/, "%3f", page);
    gsub(/\(/, "%28", page);
    gsub(/\)/, "%29", page);
    return page;
}

#
# GetPage() - get the full contents of a page via Special:Export
#
function GetPage(page,    cmd,cap,ret) {
    page = TitleEncode(page);
    ret=""
    cmd=curl "'http://www.wowwiki.com/Special:Export/" page "'";
    print cmd;
    while((cmd | getline $0)>0) {
        if($0 ~ /<text /) {
            gsub(/.*<text [^>]*>/, "");
            cap=1;
        }
        if(!cap)
            continue;
        if(/<\/text>/) {
            gsub(/<\/text>.*/, "");
            ret = ret $0;
            break;
        }
        ret = ret $0 "\n";
    }
    close(cmd);

    # Decode the XML entities used by Special:Export
    gsub("&gt;", ">", ret);
    gsub("&lt;", "<", ret);
    gsub("&quot;", "\"", ret);
    gsub("&amp;", "\\&", ret);
    return ret;
}

#
# Login() - log in. Will exit(1) on failure.
#
function Login() {
    headers = "headers.tmp";
    print "" > headers;
    close(headers);
    cmd=curl " --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --dump-header " headers " > /dev/null";
    print cmd;
    print "" | cmd;
    close(cmd);
    while((getline < headers)>0) {
        if(/^Set-Cookie: wowwikiUserID=[0-9]/) {
            close(headers)
            return;  # success!
        }
    }
    print "ERROR: Login failure!";
    exit(1);
}

#
# PutPage() - push a new revision of the given page
#
function PutPage(page, content) {
    page = TitleEncode(page);

    # Grab hold of the wpEditToken
    cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=edit'";
    print cmd;
    while((cmd | getline $0)>0) {
        if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a))
            wpEditToken = a[1];
        if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a))
            wpEdittime = a[1];
        if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a))
            wpStarttime = a[1];
    }
    close(cmd);
    if(wpEditToken=="") {
        print "ERROR: wowwiki wouldn't give me its wpEditToken!";
        exit(1);
    }

    # Post the page
    tmpfile = "post.tmp";
    print content > tmpfile;
    close(tmpfile);
    cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=submit' --form 'wpTextbox1=<" tmpfile "' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime=" wpEdittime "' --form 'wpStarttime=" wpStarttime "' --form 'wpEditToken=" wpEditToken "'";
    print cmd;
    print "" | cmd;
    close(cmd);
}

#
# PurgePage() - do "action=purge" on a page to pull in changes in templates
#
function PurgePage(page) {
    page = TitleEncode(page);
    cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=purge' > /dev/null";
    print cmd;
    print "" | cmd;
    close(cmd);
}

BEGIN {
    Login();

    for(i=0x41;i<=0x5a;i++) {
        c=sprintf("%c",i);
        if(1) {
            txt = GetPage("Events/" c);
        }
        else {
            txt="";
            while((getline < c)>0)
                txt = txt $0 "\n";
            close(c);
        }

        # Trim out the inserted dummy headers that are only there to get [Edit] links at regular intervals
        gsub(/[ \n]*\|}[ \n]*===+ +===+[ \n]*{\|[ \n]*/, "", txt);

        # Split on "{{evt" to get one event per array entry
        n = split(txt, a, /{{evt/);
        AllEvents = AllEvents "\n== " c " ==\n"
        for(ei=1;ei<=n;ei++) {
            if(match(a[ei], /\|([A-Z_]+)\|([a-zA-Z_, ]+)}}( *\n)+/, parms)) {
                name = parms[1]
                header = substr(a[ei], 1, RSTART+RLENGTH);
                # print "Event: '" name "' Categories:'" parms[2] "'";
                gsub(/^ */, "", parms[2]);
                gsub(/ *$/, "", parms[2]);
                split(parms[2], categories, / *, */);
                # for(ci in categories) print "C: '" categories[ci] "'"
                txt = substr(a[ei], RSTART+RLENGTH);
                gsub(/( *\n)*$/, "", txt);
                # print txt;
                AllEvents = AllEvents "\n* " name " → [[Events/" c "|" c "]] <small>"
                for(ci in categories) {
                    cat = categories[ci];
                    if(!KnownCat[cat])
                        print "Warning: '" name "': Unknown category: '" cat "'";
                    else {
                        CatPage[cat] = CatPage[cat] "{{evt|" name "|" parms[2] "}}\n\n" txt "\n\n\n"
                        AllEvents = AllEvents "· [[Events/" cat "|" cat "]] ";
                    }
                }
                AllEvents = AllEvents "</small>\n"
            }
        } # END for(ei=1;ei<=n;ei++)
    }

    if(1) for(cat in CatPage) {
        PutPage("Events/" cat,
            "__NOTOC____NOEDITSECTION__{{eventlistheader}}\n" \
            "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
            "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
            "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
            "\n\n" \
            ":{{icon-information}}Note that this page is automatically generated; editing it is pointless. To edit event descriptions, edit the entries in the alphabetical pages, e.g. [[Events/A]], [[Events/B]], etc. Changes there will be copied over to here within a few hours.\n" \
            "\n\n" \
            "== " cat " related events ==\n\n" \
            CatPage[cat] );
    }

    PutPage("Events/Names",
        "__NOEDITSECTION__{{tocright}}[[Category:API Events| ]]\n" \
        "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
        AllEvents );
}
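TitleEncode() in the script above is deliberately not a full URL encoder. If page titles containing a literal "%" ever show up, a slightly more defensive variant could look like the sketch below. This is an assumption about what might be needed, not something the script does; titleencode() is a hypothetical name. The only real difference is that "%" is escaped first, so that the escapes added afterwards are not encoded a second time.

```awk
# Sketch only: titleencode() is a hypothetical, slightly stricter variant of TitleEncode().
function titleencode(page) {
    gsub(/%/, "%25", page);   # first, so the escapes added below stay intact
    gsub(/ /, "_", page);
    gsub(/&/, "%26", page);
    gsub(/\?/, "%3f", page);
    gsub(/\(/, "%28", page);
    gsub(/\)/, "%29", page);
    return page;
}

BEGIN {
    print titleencode("Events/Unit Info");          # prints: Events/Unit_Info
    print titleencode("Events/Unit Info (100%)?");  # prints: Events/Unit_Info_%28100%25%29%3f
}
```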
RC bot
Takes Special:RecentChanges and reformats it into WoWWiki:RC, with changes by users listed on WoWWiki:RC/Skip removed. It also analyzes the diffs to try to find vandalism and abusive text.
This will only work under Unix (Linux). It requires that curl is installed.
#!/usr/bin/gawk -f

BEGIN {
    USERNAME = "YOURUSERNAME";
    PASSWORD = "YOURPASSWORD";

    curl="curl --max-time 180 --silent --show-error --cookie-jar cookiejar --cookie cookiejar --header 'Expect:' ";
    RS = "\r?\n";
}

#
# TitleEncode() - replace a few strategic characters that are likely to show up and cause problems. not a full url encoder.
#
function TitleEncode(page) {
    gsub(" ", "_", page);
    gsub("&", "%26", page);
    gsub(/\?/, "%3f", page);
    gsub(/\(/, "%28", page);
    gsub(/\)/, "%29", page);
    return page;
}

#
# GetPage() - get the full contents of a page via Special:Export
#
function GetPage(page,    cmd,cap,ret) {
    page = TitleEncode(page);
    ret=""
    cmd=curl "'http://www.wowwiki.com/Special:Export/" page "'";
    print cmd;
    while((cmd | getline $0)>0) {
        if($0 ~ /<text /) {
            gsub(/.*<text [^>]*>/, "");
            cap=1;
        }
        if(!cap)
            continue;
        if(/<\/text>/) {
            gsub(/<\/text>.*/, "");
            ret = ret $0;
            break;
        }
        ret = ret $0 "\n";
    }
    close(cmd);

    # Decode the XML entities used by Special:Export
    gsub("&gt;", ">", ret);
    gsub("&lt;", "<", ret);
    gsub("&quot;", "\"", ret);
    gsub("&amp;", "\\&", ret);
    return ret;
}

#
# Login() - log in. Will exit(1) on failure.
#
function Login() {
    headers = "headers.tmp";
    print "" > headers;
    close(headers);
    cmd=curl " --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --dump-header " headers " > /dev/null";
    print cmd;
    print "" | cmd;
    close(cmd);
    while((getline < headers)>0) {
        if(/^Set-Cookie: wowwikiUserID=[0-9]/) {
            close(headers)
            return;  # success!
        }
    }
    print "ERROR: Login failure!";
    exit(1);
}

#
# PutPage() - push a new revision of the given page
#
function PutPage(page, content) {
    page = TitleEncode(page);

    # Grab hold of the wpEditToken
    cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=edit'";
    print cmd;
    while((cmd | getline $0)>0) {
        if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a))
            wpEditToken = a[1];
        if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a))
            wpEdittime = a[1];
        if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a))
            wpStarttime = a[1];
    }
    close(cmd);
    if(wpEditToken=="") {
        print "ERROR: wowwiki wouldn't give me its wpEditToken!";
        exit(1);
    }

    # Post the page
    tmpfile = "post.tmp";
    print content > tmpfile;
    close(tmpfile);
    cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=submit' --form 'wpTextbox1=<" tmpfile "' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime=" wpEdittime "' --form 'wpStarttime=" wpStarttime "' --form 'wpEditToken=" wpEditToken "'";
    print cmd;
    print "" | cmd;
    close(cmd);
}

#
# PurgePage() - do "action=purge" on a page to pull in changes in templates
#
function PurgePage(page) {
    page = TitleEncode(page);
    cmd=curl " 'http://www.wowwiki.com/index.php?title=" page "&action=purge' > /dev/null";
    print cmd;
    print "" | cmd;
    close(cmd);
}

#
# GetRCSummary() - retrieve a short summary of changes
#
function GetRCSummary(rcid, diffurl) {
    if(!RCs[rcid]) {
        dellines=0;
        lol=0; lolol=0;
        url=0; tinyurl=0;
        gold=0; powerlevel=0;
        ninja=0; pwn=0; noob=0;
        insult=0; fuck=0; penis=0; pussy=0; ass=0; blows=0; gay=0;

        gsub(/&amp;/, "\\&", diffurl);
        gsub(/'/, "%27", diffurl);
        gsub(/"/, "%22", diffurl);
        gsub(/\{/, "%7b", diffurl);
        gsub(/\|/, "%7c", diffurl);
        gsub(/\}/, "%7d", diffurl);
        rccmd=curl " '" diffurl "'";
        # print rccmd;
        bOK=0
        while((rccmd | getline )>0) {
            bOK=1
            if(/<td class='diff-deletedline'>/ && /<td class='diff-addedline'><\/td>/)
                dellines++;
            if(match($0, /<td class='diff-addedline'>(.*)<\/td>/, a)) {
                $0=a[1];
                # print "ADDED: \"" $0 "\""
                oldIC = IGNORECASE;
                IGNORECASE = 1;
                if(/\yl+o+l+\y/) lol++;
                if(/lo+l+o+l/) lolol++;
                if(/la+w+l/) lolol++;
                if(/fuck/) fuck++;
                if(/tinyurl\.com/) tinyurl++;
                if(/http:\/\// && (! /thottbot.com/) && (! /allakhazam.com/) && (! /worldofwarcraft.com/) ) url++;
                if(/gold/) gold++;
                if(/power.?level/) powerlevel++;
                if(/ninja/) ninja++;
                if(/pwn/) pwn++;
                if(/n[o0][o0]+b/ || /\ynub/) noob++;
                if(/idiot/ || /moron/ || /bastard/ || /[a@][s$]+[- ]?h[o0]le/ || /schmuck/ || /\yween(ie|s)/) insult++;
                if(/f[ue]ck/ || /f[*]ck/ || /f00k/) fuck++;
                if(/\ydick/ || /p[e3]+n[i1l]+[s$]/ || /\yc[o0]+ck/) penis++;
                if(/pu[s$][s$]y/ || /pu[s$][s$][i1l][e3][s$]/ || /cu*nt/ || /c[*]nt/) pussy++;
                if(/\y[a@]ss\y/ || /\y[a@]rse\y/ || /\y[a@]sses\y/ || /[a@][s5][s5]h[a@]t/ || /[a@][s5][s5]c[l1][o0]wn/ || /[a@][s5][s5].?h[o0][l1i][e3]/) ass++;
                if(/blows/) blows++;
                if(/ga+y/ || /gh[3e]y/ || /g[3e]hy/) gay++;
                IGNORECASE = oldIC;
            }
        }
        close(rccmd);

        if(bOK) {
            txt=" ";
            if(dellines>5) txt=txt" · " dellines " deleted lines";
            if(lol) txt=txt" · \"lol\" x " lol;
            if(lolol) txt=txt" · \"lolol\" etc x " lolol;
            if(url) txt=txt" · \"http://\" x " url;
            if(tinyurl) txt=txt" · \"tinyurl.com\" x " tinyurl;
            if(gold) txt=txt" · \"gold\" x " gold;
            if(powerlevel) txt=txt" · \"power level\" x " powerlevel;
            if(ninja) txt=txt" · \"ninja\" x " ninja;
            if(pwn) txt=txt" · \"pwn\" x " pwn;
            if(noob) txt=txt" · \"noob\" x " noob;
            if(insult) txt=txt" · \"idiot\" etc x " insult;
            if(fuck) txt=txt" · \"fuck\" etc x " fuck;
            if(penis) txt=txt" · \"penis\" etc x " penis;
            if(pussy) txt=txt" · \"pussy\" etc x " pussy;
            if(ass) txt=txt" · \"ass\" etc x " ass;
            if(blows) txt=txt" · \"blows\" x " blows;
            if(gay) txt=txt" · \"gay\" etc x " gay;
            RCs[rcid] = txt;
            print rcid "\t'" txt "'";
            print rcid "\t" txt >> "rcs.txt";
            fflush("rcs.txt");
        }
        else {
            return "(error getting diff)";
        }
    }
    if(RCs[rcid]==" ")
        return "";
    return RCs[rcid];
}

BEGIN {
    # Read summaries of previously scanned recent changes
    rcfile="rcs.txt";
    while((getline < rcfile)>0) {
        split($0, a, "\t");
        RCs[a[1]] = a[2];
    }
    close(rcfile);

    # Log in :-)
    Login();

    # Get list of users to skip
    txt = GetPage("WoWWiki:RC/Skip");
    n = split(txt, a, /\n/);
    for(i=1;i<=n;i++) {
        name = a[i];
        gsub(/^\* +/, "", name);
        gsub(/ *$/, "", name);
        aSkip[name]=1;
    }

    res = "<noinclude><!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->{{nocat}}</noinclude>\n\n";

    ######## Retrieve and parse the recent changes page
    rc_tmp = "rc.tmp";  # Retrieve to a temp file or the operation times out while we're scanning changed pages
    system(curl " 'http://www.wowwiki.com/index.php?title=Special:Recentchanges&limit=1500&hidepatrolled=1&days=3' > " rc_tmp);
    while((getline < rc_tmp)>0) {
        time=0;
        if(match($0, /<h4>(.*)<\/h4>/, a))
            res = res "\n\n=== " a[1] " ===\n";

        if(match($0, /(<\/tt>|<strong>)<a href="\/([^"]*)"[^>]*>([^<]*)<\/a>/, a)) {
            pagelink=a[2];
            pagetitle=a[3];
            subitem=0;
        }
        else if(match($0, /<tt><a href="\/([^"]*)"[^>]*>([^<]*)<\/a>/, a)) {
            pagelink=a[1];
            pagetitle=a[2];
            time=" ";
            subitem=1;
        }
        else
            continue;

        if(subitem && bSkip)
            continue;

        pageurl="";
        if(pagelink ~ /^index.php/) {
            pageurl="http://www.wowwiki.com/" pagelink
            page = "[" pageurl " " pagetitle "]";
        }
        else {
            gsub(/^[Cc]ategory:/, ":Category:", pagelink);
            gsub(/^[I]mage:/, ":Image:", pagelink);
            page = "[[" pagelink "|" pagetitle "]]";
        }

        if(match($0, /<a href="([^"]*)"[^>]*>diff<\/a>/, a))
            diffurl="http://www.wowwiki.com" a[1];
        else if(match($0, /<a href="([^"]*)"[^>]*>last<\/a>/, a))
            diffurl="http://www.wowwiki.com" a[1];
        else if(match($0, /<a href="([^"]*)"[^>]*>changes<\/a>/, a))
            diffurl="http://www.wowwiki.com" a[1];
        else
            diffurl="";

        if(match($0, /<a href="([^"]*)"[^>]*>hist<\/a>/, a))
            histurl="http://www.wowwiki.com" a[1];
        else if(match($0, /<a href="([^"]*)"[^>]*>Page history<\/a>/, a))
            histurl="http://www.wowwiki.com" a[1];
        else
            histurl="";

        if(match($0, /<a href="([^"]*)"[^>]*>cur<\/a>/, a))
            curdiffurl="http://www.wowwiki.com" a[1];
        else
            curdiffurl="";

        if(!time) {
            if(!match($0, / ([0-9][0-9]:[0-9][0-9]:[0-9][0-9]) /,a))
                continue;
            time="<tt>" a[1] "</tt>"
        }

        user=""
        usertxt="";
        if(match($0, /\. \. <a [^>]*title="User:([^"]*)"/, a)) {
            user=a[1];
            usertxt="[[User:" user "|" user "]]";
        }
        else if(match($0, /<span class="changedby">\[(.*)\]<\/span>/, a)) {
            usertxt="[ <small>" gensub(/<a [^>]*title="([^"]*)"[^>]*>([^<]*)<\/a>/, "[[\\1|\\2]]", "g", a[1]) "</small> ]";
        }

        # See if we only have skippable users in this bundle (which may not be a bundle; it can be a single entry, but that doesn't matter)
        if(!subitem) {
            bSkip = 1;
            split(usertxt, a, /\[\[/);
            for(i in a) {
                if(match(a[i], /User:(.*)\|/, aa)) {
                    if(!aSkip[aa[1]]) {
                        bSkip = 0;
                    }
                }
            }
        }
        if(bSkip)
            continue;

        if(user)
            print "User: " user " Title: " pagetitle;
        else
            print "----------- Title: " pagetitle;

        if(match($0, /<span class='comment'>\((.*)\)<\/span>/, a)) {
            comment=a[1];
            gsub(/<a [^>]*>/, "", comment);
            gsub(/<\/a>/, "", comment);
            gsub(/{/, "\\&#123;", comment);
            gsub(/\[\[Category:/, "[[:Category:", comment);
            comment=" (" comment ")"
        }
        else
            comment="";

        # Get recent change summary (cached or new). This mucks up $0 so keep it last.
        rcurl = diffurl;
        if(pageurl ~ /rcid=/)
            rcurl = pageurl;
        if(match(rcurl, "rcid=([0-9]+)", a)) {
            rctxt = GetRCSummary(a[1], rcurl);
            if(rctxt!="")
                rctxt = " <b style=\"border-bottom: 1px dotted;\">" rctxt "</b>";
        }
        else {
            rctxt = "";
        }

        if(subitem) {
            res = res ":::* " page " . . <small>[" diffurl " diff] . [" curdiffurl " cur]</small> . . " usertxt " <small>" comment "</small>" rctxt "\n"
        }
        else
            res = res "* " time " " page " . . <small>[" diffurl " diff] . [" histurl " hist]</small> . . " usertxt " <small>" comment "</small>" rctxt "\n"

        fflush();

        if(length(res)>=256000) {
            res = res "\n''RC output length exceeds 256 KB. Ignoring older entries. To see older entries, mark some of the above changes as patrolled and older ones will start appearing on the next update.\n";
            break;
        }
    }
    close(rc_tmp);

    PutPage("WoWWiki:RC/Content", res);
    PurgePage("WoWWiki:RC");
}
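Most of the pattern list in the bot depends on two gawk extensions: IGNORECASE, which makes every regex match case-insensitively, and the \y operator, which matches only at a word boundary. A small sketch of how the two interact, using two of the bot's own patterns; the helper names haslol() and hasnub() are hypothetical and exist only for this example.

```awk
# Sketch only: haslol() and hasnub() are hypothetical helpers for illustration.
function haslol(s) { return (s ~ /\yl+o+l+\y/) ? 1 : 0; }   # whole-word "lol", "looool", ...
function hasnub(s) { return (s ~ /\ynub/) ? 1 : 0; }        # "nub" at the start of a word

BEGIN {
    IGNORECASE = 1;
    n = split("LOL|lolol|trolling|a nub|snub", samples, "|");
    for (i = 1; i <= n; i++)
        printf "%-10s lol=%d nub=%d\n", samples[i], haslol(samples[i]), hasnub(samples[i]);
}
```

Only "LOL" sets the lol flag and only "a nub" sets the nub flag: "lolol" is left to the separate /lo+l+o+l/ pattern, and "snub" and "trolling" are rejected by the word boundary.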
Forumizing wikitext
Rich formatting
# Usage: gawk -f forumize.awk wikitext.txt > forum.txt
#
# Will attempt to convert wikitext to text suitable for pasting in a forum
# Mostly tested with http://www.wowwiki.com/UI_FAQ to http://wowinterface.com
#

BEGIN {
    RS="\r?\n";
    sectidx=0;
    BIGBIG = "[SIZE=4]"
    _BIGBIG = "[/SIZE]"
    BIG = "[SIZE=3]"
    _BIG = "[/SIZE]"
    TT = "[FONT=Courier New][COLOR=LightBlue]"
    _TT = "[/COLOR][/FONT]"
}

/^ *$/ && skipblanklines {
    next
}

skipblanklines {
    skipblanklines = 0
}

match($0, /^(=+) *([^=]+) *=+(.*)/, a) {
    if(a[1]=="=") {
        sectidx = sectidx + 1;
        outidx = outidx "[b]" sectidx ". " a[2] "[/b]\n";
        $0 = "" BIGBIG "[u][b]" sectidx ". " a[2] "[/b][/u]" _BIGBIG a[3];
    }
    else if(a[1]=="==") {
        outidx = outidx "... - " a[2] "\n";
        $0 = "" BIG "[u][b]" a[2] "[/b][/u]" _BIGBIG a[3];
    }
    else
        $0 = "[u][b]" a[2] "[/b][/u]" a[3];
}

/'''/ {
    while(/'''/) {
        sub(/'''/, "[b]");
        if(!sub(/'''/, "[/b]"))
            $0 = $0 "[/b]";
    }
}

/''/ {
    while(/''/) {
        sub(/''/, "[i]");
        if(!sub(/''/, "[/i]"))
            $0 = $0 "[/i]";
    }
}

/^{{faqq}}/ {
    sub(/^{{faqq}} */, "[b]" BIG "[COLOR=Orange]Q:[/COLOR]" _BIG " ");
    $0 = $0 "[/b]";
}

/^{{faqa}}/ {
    sub(/^{{faqa}} */, "[b]" BIG "A:" _BIG "[/b] ");
}

/^:/ {
    sub(/^::/, " ");
    sub(/^:/, " ");
}

/^;.*:/ {
    $0=gensub(/^;([^:]+):(.*)/, "[b]\\1[/b]\n\\2", "1", $0);
}

/^;/ {
    $0=gensub(/^;([^:]+)/, "[b]\\1[/b]", "1", $0);
}

/^[^#]/ {
    count=0;
}

/^#/ {
    count=count+1;
    sub(/^# */, count ". ");
}

/\[http:\/\// {
    $0 = gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "[URL=\\1]\\2[/URL]", "g", $0);
}

/{{/ {
    gsub(/{{[Ee]xample\/Begin}}/, "[quote]");
    gsub(/{{[Ee]xample\/End}}/, "[/quote]");
    $0 = gensub(/{{[Ff]aqcredit[|]([^}]+)}}/, "[i](credit: \\1)[/i]", "g", $0);
}

/\[\[#/ {
    $0 = gensub(/\[\[#[^|]+\|([^\]]+)\]\]/, "[i]\\1[/i]", "g", $0);
}

/<div/ {
    gsub("[\r\n]+$", "\n", out);
    gsub(/<div [^>]*>/, "[INDENT]");
    sub("[\r\n]+", "\n");
    skipblanklines=1;
}

/<\/div>/ {
    gsub("[\r\n]+$", "\n", out);
    gsub(/<\/div>/, "[/INDENT]");
    sub("[\r\n]+", "\n");
    skipblanklines=1;
}

/<code>/ || /<tt>/ {
    gsub(/(<code>|<tt>)/, TT);
}

/<\/code>/ || /<\/tt>/ {
    gsub(/(<\/code>|<\/tt>)/, _TT);
}

{
    out = out "\n" $0
    next;
}

END {
    print outidx;
    print "";
    print out;
}
Basic formatting (e.g. blizzard forums)
# Usage: gawk -f forumize-basic.awk wikitext.txt > forum.txt
#
# Will attempt to convert wikitext to text suitable for pasting in a forum
# with VERY basic markup, e.g. the blizzard WoW forums
#

BEGIN {
    RS="\r?\n";
    sectidx=0;
    BIGBIG = ""
    _BIGBIG = ""
    BIG = ""
    _BIG = ""
    TT = "[u]"
    _TT = "[/u]"
}

/^ *$/ && skipblanklines {
    next
}

skipblanklines {
    skipblanklines = 0
}

match($0, /^(=+) *([^=]+) *=+(.*)/, a) {
    if(a[1]=="=") {
        sectidx = sectidx + 1;
        outidx = outidx "[b]" sectidx ". " a[2] "[/b]\n";
        $0 = "" BIGBIG "[u][b]" sectidx ". " a[2] "[/b][/u]" _BIGBIG a[3];
    }
    else if(a[1]=="==") {
        outidx = outidx "... - " a[2] "\n";
        $0 = "" BIG "[u][b]" a[2] "[/b][/u]" _BIGBIG a[3];
    }
    else
        $0 = "[u][b]" a[2] "[/b][/u]" a[3];
}

/'''/ {
    while(/'''/) {
        sub(/'''/, "[b]");
        if(!sub(/'''/, "[/b]"))
            $0 = $0 "[/b]";
    }
}

/''/ {
    while(/''/) {
        sub(/''/, "[i]");
        if(!sub(/''/, "[/i]"))
            $0 = $0 "[/i]";
    }
}

/^{{faqq}}/ {
    sub(/^{{faqq}} */, "[b]Q: ");
    $0 = $0 "[/b]";
}

/^{{faqa}}/ {
    sub(/^{{faqa}} */, "[b]A: [/b]");
}

/^:/ {
    sub(/^::/, " ");
    sub(/^:/, " ");
}

/^;.*:/ {
    $0=gensub(/^;([^:]+):(.*)/, "[b]\\1[/b]\n\\2", "1", $0);
}

/^;/ {
    $0=gensub(/^;([^:]+)/, "[b]\\1[/b]", "1", $0);
}

/^[^#]/ {
    count=0;
}

/^#/ {
    count=count+1;
    sub(/^# */, count ". ");
}

/\[http:\/\// {
    $0 = gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "\\2 ( \\1 )", "g", $0);
}

/{{/ {
    gsub(/{{[Ee]xample\/Begin}}/, "[quote]");
    gsub(/{{[Ee]xample\/End}}/, "[/quote]");
    $0 = gensub(/{{[Ff]aqcredit[|]([^}]+)}}/, "[i](credit: \\1)[/i]", "g", $0);
}

/\[\[#/ {
    $0 = gensub(/\[\[#[^|]+\|([^\]]+)\]\]/, "[i]\\1[/i]", "g", $0);
}

/<div/ {
    gsub("[\r\n]+$", "\n", out);
    gsub(/<div [^>]*>/, "");
    sub("[\r\n]+", "\n");
    $0 = "[ul]\n[li]" $0;
    skipblanklines=0;
}

/<\/div>/ {
    gsub("[\r\n]+$", "\n", out);
    gsub(/<\/div>/, "[/ul]");
    sub("[\r\n]+", "\n");
    skipblanklines=1;
}

/<code>/ || /<tt>/ {
    gsub(/(<code>|<tt>)/, TT);
}

/<\/code>/ || /<\/tt>/ {
    gsub(/(<\/code>|<\/tt>)/, _TT);
}

{
    out = out "\n" $0
    next;
}

END {
    print outidx;
    print "";
    print out;
}
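Both forumizers convert wiki quotes to BBCode the same way: replace markers pairwise, and append a closing tag if a line leaves one open. The bold pass has to run before the italic pass, because ''' also contains ''. A minimal sketch of that step as a standalone function; quotes2bb() is a hypothetical name, the scripts do this inline on $0.

```awk
# Sketch only: quotes2bb() is a hypothetical wrapper; the scripts above work on $0 directly.
function quotes2bb(s) {
    # Bold first: ''' would otherwise be consumed by the '' pass.
    while (s ~ /'''/) {
        sub(/'''/, "[b]", s);
        if (!sub(/'''/, "[/b]", s))
            s = s "[/b]";          # unclosed bold: close it at end of line
    }
    while (s ~ /''/) {
        sub(/''/, "[i]", s);
        if (!sub(/''/, "[/i]", s))
            s = s "[/i]";          # unclosed italic: close it at end of line
    }
    return s;
}

BEGIN {
    print quotes2bb("'''Q:''' is ''this'' bold?");  # prints: [b]Q:[/b] is [i]this[/i] bold?
    print quotes2bb("an ''unclosed italic");        # prints: an [i]unclosed italic[/i]
}
```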