github.com/ClusterM/fceux.git
author     zeromus <zeromus@users.noreply.github.com>  2018-04-09 01:35:15 +0300
committer  zeromus <zeromus@users.noreply.github.com>  2018-04-09 01:35:15 +0300
commit     0307e3b827da185054e3251bb4165734b890cf59
tree       0dfa164135e09164ce3aa922e19e342a6abb82c0 /documentation
parent     4d2e8eee53cca4c12b22e8d081fc7e47c231a488
migration tidying (build master from trunk)
Diffstat (limited to 'documentation')
-rw-r--r--  documentation/.gitignore                           17
-rw-r--r--  documentation/TODO-PROJECT                         61
-rw-r--r--  documentation/Videolog.txt                         48
-rw-r--r--  documentation/cheat.html                          313
-rw-r--r--  documentation/faq                                  67
-rw-r--r--  documentation/fceux-net-server.6                   58
-rw-r--r--  documentation/fceux.6                             326
-rw-r--r--  documentation/fcs.txt                             153
-rw-r--r--  documentation/fm2.txt                              79
-rw-r--r--  documentation/porting.txt                         289
-rw-r--r--  documentation/protocol.txt                         90
-rw-r--r--  documentation/snes9x-lua.html                     417
-rw-r--r--  documentation/tech/cpu/4017.txt                    97
-rw-r--r--  documentation/tech/cpu/dmc.txt                    235
-rw-r--r--  documentation/tech/cpu/nessound-4th.txt           551
-rw-r--r--  documentation/tech/cpu/nessound.txt               697
-rw-r--r--  documentation/tech/exp/mmc5-e.txt                 250
-rw-r--r--  documentation/tech/exp/smb2j.txt                  112
-rw-r--r--  documentation/tech/exp/tengen.txt                  18
-rw-r--r--  documentation/tech/exp/vrcvi.txt                  388
-rw-r--r--  documentation/tech/exp/vrcvii.txt                 321
-rw-r--r--  documentation/tech/nsfspec.txt                    336
-rw-r--r--  documentation/tech/ppu/2c02 technical operation.txt  296
-rw-r--r--  documentation/tech/ppu/loopy1.txt                  63
-rw-r--r--  documentation/tech/ppu/loopy2.txt                  33
-rw-r--r--  documentation/tech/readme.now                       6
-rw-r--r--  documentation/tech/readme.sound                     2
-rw-r--r--  documentation/todo                                 70
28 files changed, 5393 insertions, 0 deletions
diff --git a/documentation/.gitignore b/documentation/.gitignore
new file mode 100644
index 00000000..674deb69
--- /dev/null
+++ b/documentation/.gitignore
@@ -0,0 +1,17 @@
+# A simulation of Subversion default ignores, generated by reposurgeon.
+*.o
+*.lo
+*.la
+*.al
+*.libs
+*.so
+*.so.[0-9]*
+*.a
+*.pyc
+*.pyo
+*.rej
+*~
+*.#*
+.*.swp
+.DS_store
+# Simulated Subversion default ignores end here
diff --git a/documentation/TODO-PROJECT b/documentation/TODO-PROJECT
new file mode 100644
index 00000000..6354d36d
--- /dev/null
+++ b/documentation/TODO-PROJECT
@@ -0,0 +1,61 @@
+Items to be completed before 2.0 release
+
+FASTAPASS / FP_FASTAPASS / Are these archaic? They suck - ??
+
+Separate frameskip/pause from EmulationPaused - zeromus
+
+Make ALL Debugging code conditional - zeromus
+
+Doxygen integration - DONE
+ * website integration
+
+Linux build - soules
+ * clean-up of input device code
+
+Windows build - zeromus
+ * verify rerecording
+ * verify debugging
+ * verify avi writing
+
+Commandline parsing - lukas
+ * verify in windows - zeromus
+
+Configfile parsing - lukas
+ * verify in windows - zeromus
+
+Do we really need vc7 project? Cah4e3? - zeromus
+Source code docs cleaning - zeromus
+Homepage docs - zeromus
+
+What is the proper gnu way to do CREDITS from the homepage? e.g. Marat Fayzullin - General NES information
+
+Freenode registration - lukas
+ * homepage update
+
+Default hotkey philosophy (there are different philosophies from XD and TAS) - ALL
+
+Netplay - FUTURE STUFF
+ Move server code into tree - DONE
+ Ensure netplay compiles
+ verify netplay
+
+----done-----
+
+Investigate OSX build [ http://www.lamer0.com/ ] - zeromus [posted on blog...]
+
+Move to scons - DONE
+
+Merge garnet and sf repos - DONE
+ * Ensure gnome frontend is in repo - DONE
+ * Move to sourceforge SVN - DONE
+
+--strategic--
+
+Politics:
+ * aboutbox credits
+ * AUTHORS
+
+Release:
+ * Press Release - ??
+ * Homepage updates
+ - eliminate old news, add new links
diff --git a/documentation/Videolog.txt b/documentation/Videolog.txt
new file mode 100644
index 00000000..15c349b1
--- /dev/null
+++ b/documentation/Videolog.txt
@@ -0,0 +1,48 @@
+Since SVN revision 931, FCEUX features a new option to create avi files from a recorded movie and it is relatively easy to use if you know the bare basics of mencoder.
+Call "scons CREATE_AVI=1" to activate it. You will, however, most likely need mencoder to use it.
+
+You get the raw video data via stdin and the audio data from a fifo file. Let's say you want the video to be in the best quality available, no matter how long it takes or how big the avi file might get. In order to get the NES's original video resolution and a good sound quality, you might need to set some settings beforehand or just pass them along while calling mencoder.
+
+
+Here's an example:
+./fceux \
+ --xscale 1 --yscale 1 --special 0 \
+ --pal 0 \
+ --sound 1 --soundq 1 --soundrate 48000 --mute 1 \
+ --nospritelim 1 \
+ --no-config 1 \
+ --videolog "mencoder - -o myfirstencodedrun.avi \
+ -ovc x264 -x264encopts qp=0 \
+ -oac pcm \
+ -noskip -nocache -mc 0 -aspect 4/3 \
+ NESVSETTINGS" \
+ --playmov mymovie.fm2 myROM.nes
+
+Now let's see what is done and why we did it:
+First of all, we started fceux with "./fceux" and gave it some options:
+ "--xscale" and "--yscale" determine how much bigger the video is in comparison to its regular size. There is no point in using anything other than 1 here because you can always watch your video in fullscreen or at least scale it during playback. As a nice bonus, it takes less time to create the avi file and also saves valuable space on your hard disk.
+ "--special" would usually do something fancy to your picture when you're playing a ROM, but again, it's mostly pointless to use for an avi.
+ "--pal 0" lets the game run at ~60Hz. Set this to 1 if you are using a PAL ROM.
+ "--sound 1" activates sound.
+ "--soundq 1" activates high quality sound.
+ "--soundrate 48000" sets the sound rate to 48kHz.
+ "--mute 1" mutes FCEUX while still passing the sound to mencoder. This way you can capture a movie and still listen to your music without having the NES sounds in the background.
+ "--nospritelim" deactivates the NES's limit of 8 sprites per scanline.
+ "--no-config 1" is used so that creating an avi movie does not overwrite your saved settings.
+ "--videolog" calls mencoder:
+ "-" states that we're getting the video stream from stdin.
+ "-o" determines the name of the produced avi file.
+ "-ovc x264" sets the video codec to x264 and is highly recommended for quality reasons. However, if you are using a version of x264 from Sep 28th 2008 or newer, mplayer will not be able to decode the result properly. Until this is fixed in mplayer, you might want to replace "-ovc x264 -x264encopts qp=0" with "-ovc lavc -lavcopts vcodec=ffv1:format=bgr32:coder=1:vstrict=-1". Watch out, though, as this needs *way* more space than x264 does.
+ "-x264encopts qp=0" tells the x264 codec to use a quantizer of 0, which results in lossless video quality.
+ "-oac pcm" saves the audio data uncompressed (this might turn out really big).
+ "-noskip" makes sure that no frame is dropped during capturing.
+ "-nocache" is responsible for immediate encoding and not using any cache.
+ "-mc 0" makes sure that the sound does not go out of sync.
+ "-aspect 4/3" sets the avi's aspect ratio so you can see it in fullscreen and have no borders to the left and right.
+ "NESVSETTINGS" takes care of proper recognition of the audio and video data from FCEUX.
+ "&> mencoder.log" lets mencoder log its output into a file called "mencoder.log" in your current working directory.
+ "--playmov" reads which movie file we want to load (here it's mymovie.fm2).
+ Lastly, we load our desired ROM (in this case it's "myROM.nes").
+
+To go for faster encoding and thus less quality, change "-ovc x264 -x264encopts qp=0" to "-ovc xvid -xvidencopts bitrate=200" and "-oac pcm" to "-oac mp3lame -lameopts mode=3:preset=60" to create a 200 kbps xvid video with 60 kbps of mono mp3 audio.
+Good luck! :)
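
Illustration for documentation/Videolog.txt (not part of the commit): the example command is long enough that it may help to assemble it programmatically. The following is a minimal Python sketch that only builds the argument list and does not launch anything. The function name `build_fceux_command` and its parameters are hypothetical; every flag and value is copied from the example and from the faster-encoding variant described in the file.

```python
import shlex


def build_fceux_command(movie, rom, output, lossless=True):
    """Return the fceux argument list from the Videolog.txt example.

    Hypothetical helper: only the flag names and values come from the
    documentation; nothing here is part of FCEUX itself.
    """
    if lossless:
        # Lossless x264 video and uncompressed PCM audio.
        codecs = ["-ovc", "x264", "-x264encopts", "qp=0", "-oac", "pcm"]
    else:
        # Faster variant: 200 kbps xvid video, 60 kbps mono mp3 audio.
        codecs = ["-ovc", "xvid", "-xvidencopts", "bitrate=200",
                  "-oac", "mp3lame", "-lameopts", "mode=3:preset=60"]
    mencoder = (["mencoder", "-", "-o", output] + codecs +
                ["-noskip", "-nocache", "-mc", "0", "-aspect", "4/3",
                 "NESVSETTINGS"])
    return [
        "./fceux",
        "--xscale", "1", "--yscale", "1", "--special", "0",
        "--pal", "0",
        "--sound", "1", "--soundq", "1", "--soundrate", "48000", "--mute", "1",
        "--nospritelim", "1",
        "--no-config", "1",
        # fceux receives the whole mencoder invocation as one argument.
        "--videolog", shlex.join(mencoder),
        "--playmov", movie, rom,
    ]


command = build_fceux_command("mymovie.fm2", "myROM.nes", "myfirstencodedrun.avi")
print(shlex.join(command))
```

Passing the list to a process launcher instead of printing it would avoid shell quoting problems with the nested mencoder string, which is the main reason for building it as a list first.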
diff --git a/documentation/cheat.html b/documentation/cheat.html
new file mode 100644
index 00000000..a0eac597
--- /dev/null
+++ b/documentation/cheat.html
@@ -0,0 +1,313 @@
+<html>
+ <head>
+ <title>FCE Ultra Cheat Guide</title>
+ </head>
+ <body>
+ <center><h1>FCE Ultra Cheat Guide</h1></center>
+ <center><i>Last updated November 12, 2003<br />Valid as of FCE Ultra 0.97.4</i></center>
+<p>
+ <b>Table of Contents:</b>
+ <ul>
+ <li><a href="#intro">Introduction</a>
+ <ul>
+ <li><a href="#cheatfiles">Cheat Files</a>
+ </ul>
+ <li><a href="#windows">The Windows Interface</a>
+ <ul>
+ <li><a href="#windows-search">Cheat Search Interface</a>
+ </ul>
+ <li><ba href="#text">The Text Interface(TODO)</ba>
+ <li><a href="#examples">Examples</a>
+ <ul>
+ <li><a href="#examples-mm3">"Mega Man 3" Windows Example</a>
+ <li><a href="#examples-oh">"Over Horizon" Text Interface Example</a>
+ </ul>
+ <li><a href="#tips">Tips</a>
+</ul>
+<hr width="100%">
+<a name="intro"><h2>Introduction</h2></a>
+<p>
+ FCE Ultra allows cheating by the periodic "patching" of arbitrary addresses
+ in the 6502's memory space with arbitrary values, as well as read substitution.
+ "Read substitution" is the method that would be used on a real NES/Famicom,
+ such as done by the Game Genie and Pro Action Replay. It is required
+ to support GG and PAR codes, but since it is relatively slow when done
+ in emulation, it is not the preferred method when a RAM patch will
+ suffice. Also, in FCE Ultra, read substitution will not work properly with
+ zero-page addressing modes(instructions that operate on RAM at $0000 through
+ $00FF).
+</p>
+<p>
+ The RAM patches are all applied a short time before the emulated
+ vertical blanking period. This detail shouldn't concern most people, though.
+ However, this does mean that cheating with games that use
+ bank-switched RAM may be problematic. Fortunately, such games are not very
+ common(in relation to the total number of NES and Famicom games).
+</p>
+<a name="cheatfiles"><h3>Cheat Files</h3></a>
+<p>
+ Cheats are stored in the "cheats" subdirectory under the base FCE Ultra
+ directory. The files are in a simple plain-text format. Each line represents
+ a one-byte memory patch. The format is as follows(text in brackets []
+ represents optional parameters):
+</p>
+<p>
+<blockquote>
+ [S][C][:]Address(hex):Value(hex):[Compare value:]Description
+</blockquote>
+ Example:
+
+<blockquote> 040e:05:Infinite super power.</blockquote>
+</p>
+<p>
+ A colon(:) near the beginning of the line is used to disable the cheat.
+ "S" denotes a cheat that is a read-substitute-style cheat(such as with Game
+ Genie cheats), and a "C" denotes that the cheat has a compare value.
+ </p>
+
+<hr width="100%">
+<a name="windows"><h2>The Windows Interface</h2></a>
+<p>
+ All addresses listed in the cheats window are in unsigned
+ 16-bit hexadecimal format and all values in these windows are in an
+ unsigned 8-bit decimal format(the range for values is 0 through 255).
+</p>
+<p>
+ The cheats window contains the list of cheats for the currently loaded game
+ on the right side. Existing cheats can be selected, edited, and updated
+ using the "Update" button.
+</p>
+<a name="windows-search"><h3>Cheat Search Interface</h3></a>
+<p>
+ The cheat search interface consists of several components: a list of
+ addresses and associated data for a search, several command buttons,
+ and the search parameters.
+</p>
+<p>
+ Each entry in the list is in the format of:
+ <blockquote>Address:Original Value:Current Value</blockquote>
+</p>
+<p>
+ The address is the location in the 6502's address space, the original
+ value is the value that was stored at this address when the search was
+ reset, and the current value is the value that is currently stored at
+ that address. Selecting an item in this list will automatically cause
+ the "Address" field in the cheat information box on the right side of the
+ window to be updated with the selected address.
+</p>
+<p>
+ The "Reset Search" button resets the search process; all valid addresses
+ are displayed in the cheat list and the data values at those addresses noted.
+</p>
+<p>
+ The "Do Search" button performs a search based on the search parameters
+ and removes any non-matching addresses from the address list.
+</p>
+<p>
+ The "Set Original to Current" button sets the remembered original values
+ to the current values. It is like the "Reset Search" button, but it does
+ not affect which addresses are shown in the address list. This command is
+ especially useful when used in conjunction with the "O!=C" search filter.
+</p>
+<p>
+ The "Unhide Excluded" button shows all addresses that are excluded as a
+ result of any previous searches. It is like the "Reset Search" button
+ except that it does not affect the remembered original values.
+</p>
+<p>
+ The values named "V1" and "V2" have different meanings based
+ on which filter is selected. A list of the names of the filters and detailed
+ information on what they do follows("original value" corresponds to the value
+ remembered for a given address and "current value" is the value currently
+ at that address. Also, if an address is not explicitly said to be shown
+ under a certain condition, then it is excluded.):
+</p><p>
+ "O==V1 && C==V2":
+<blockquote>
+ Show the address if the original value is equal to "V1" AND
+ the current value is equal to "V2".
+</blockquote>
+</p>
+<p>
+ "O==V1 && |O-C|==V2":
+<blockquote>
+ Show the address if the original value is equal to "V1" AND
+ the difference between the current value and the original
+ value is equal to "V2".
+</blockquote>
+</p>
+<p>
+ "|O-C|==V2":
+<blockquote>
+ Show the address if the difference between the current value
+ and the original value is equal to "V2".
+</blockquote>
+</p>
+<p>
+ "O!=C":
+<blockquote>
+ Show the address if the original value does not equal the
+ current value.
+</blockquote>
+</p>
+<p>
+ The following cheat methods/filters automatically perform the function
+ of the "Set Original to Current" button after "Do Search" is pressed.
+</p>
+<p>
+ "Value decreased."
+<blockquote>
+ Show the address if the value has decreased.
+</blockquote>
+</p>
+<p>
+ "Value increased."
+ <blockquote>
+ Show the address if the value has increased.
+ </blockquote>
+</p>
+
+<hr width="100%">
+<a name="examples"><h2>Examples</h2></a>
+<a name="examples-mm3"><h3>"Mega Man 3" Windows Example</h3></a>
+<p>
+ This example will give Mega Man unlimited energy.
+ Immediately after entering the Top Man stage, make your way to the
+ "Add Cheat" window. Push "Reset Search".
+ Go back to playing and move right until the first enemy appears. Allow
+ yourself to be hit twice. Each hit does "2" damage, so you've lost 4 energy
+ bars. Go to the "Add Cheat" window again and select the third filter
+ ("|O-C|==V2") and enter the value 4 next to "V2". Then push "Do Search".
+</p>
+<p>
+ Several addresses will appear in the address list. You can try to find
+ the address you want through trial and error, or you can narrow the results
+ down further. We will do the latter.
+</p>
+<p>
+ Go back to playing MM3 and get hit one more time and make your way back
+ to the "Add Cheat" window. Your damage is now "6". You can probably
+ see which address contains your life(it is 00A2). If not, change
+ V2 to 6 and push "Do Search" again. This should leave only 00A2.
+</p>
+<p>
+ Select that entry in the address list. Shift your attention to the "Add
+ Cheat" box to the right. Type in a meaningful name and the desired value(156;
+ it was the value when you had no damage, so it's safe to assume it's the
+ maximum value you can use). Push the "Add" button and a new entry will
+ appear in the cheats list. The cheat has been added.
+</p>
+<a name="examples-oh"><h3>"Over Horizon" Text Interface Example</h3></a>
+<p>
+ This example will give you infinite lives in the NTSC(Japanese) version
+ of "Over Horizon".
+</p>
+<p>
+ Start a new game. Notice that when you press "Start" during gameplay,
+ the number of lives you have left is indicated. With no cheating, you
+ start with 3 lives(2 lives left).
+</p>
+<p>
+ Activate the cheat interface immediately after starting a new game.
+ Select the "New Cheats" menu and "Reset Search".
+</p>
+<p>
+ I'll assume that the number of lives left shown in the game is the same number
+ that's stored in RAM. Now, "Do Search". You're going to use the first search
+ filter. For V1, enter the value 2. For V2, enter the same value. This,
+ coupled with the fact that you just reset the search, will allow you to search
+ for a value "absolutely"(as opposed to changes in the value).
+</p>
+<p>
+ Now, "Show Results". When I did it, I received 11 results:
+</p>
+<pre>
+ 1) $0000:002:002
+ 2) $001c:002:002
+ 3) $001e:002:002
+ 4) $009d:002:002
+ 5) $00b9:002:002
+ 6) $00e3:002:002
+ 7) $0405:002:002
+ 8) $0406:002:002
+ 9) $0695:002:002
+ 10) $07d5:002:002
+ 11) $07f8:002:002
+</pre>
+<p>
+ You really can't do much yet(unless you want to spend time doing trial
+ and error cheat additions). Return to the game.
+</p>
+<p>
+ After losing a life, go back to the cheat interface, to the "New Cheats"
+ menu, and "Show Results". Here are my results:
+</p>
+<pre>
+ 1) $0000:002:002
+ 2) $001c:002:002
+ 3) $001e:002:002
+ 4) $009d:002:002
+ 5) $00b9:002:041
+ 6) $00e3:002:002
+ 7) $0405:002:001
+ 8) $0406:002:002
+ 9) $0695:002:002
+ 10) $07d5:002:001
+ 11) $07f8:002:002
+</pre>
+<p>
+ Notice that two addresses seem to hold the number of lives($0405 and
+ $07d5). You can lose another life and go "Show Results" again, and you
+ should see that $07d5 is the address that holds the number of lives.
+</p>
+<p>
+ Now that you know the address that holds the number of lives, you can
+ add a cheat. You can either type in the number from the cheat results list
+ corresponding to the address you want to add a cheat for, or you can
+ remember the address and select "Add Cheat" from the "New Cheats" menu.
+ Do the former.
+</p>
+<p>
+ Now you will need to enter a name for the cheat. I suggest something short,
+ but descriptive. "Infinite lives" will work fine. Next, a prompt for
+ the address will show up. Since you selected an item from the list, you
+ can press enter to use the associated address($07d5). Next, you will
+ need to enter a value. It doesn't need to be large(in fact, it probably
+ shouldn't be; abnormally high numbers can cause some games to misbehave).
+ I suggest a value of 2. After this, you should get a prompt that looks like
+ this:
+</p>
+<pre>
+ Add cheat "Infinite lives" for address $07d5 with value 002?(Y/N)[N]:
+</pre>
+<p>
+ Answer "Y". You now have infinite lives.
+</p>
+<hr width="100%">
+<a name="tips"><h2>Tips</h2></a>
+<p>
+ Games store player information in many different ways. For example,
+ if you have "3" lives in Super Wacky Dodgeball 1989, the game might store
+ it in memory as 2, 3, or 4, or perhaps a different number altogether.
+ Also, say that you have 69 life points out of 200 in Mole Mashers. The
+ game might store how many life points you have, or how much damage you have
+ taken. Relative value searches are very valuable because you probably
+ don't know the way that the game stores its player data.
+</p>
+<p>
+ Some games, especially RPGs, deal with individual numbers greater than
+ 8-bits in size. Most that I've seen seem to store the multiple-byte data
+ least significant byte(lower byte of number) first in memory, though
+ conceivably, it could be stored most significant byte first, or the component
+ bytes of the number could be non-contiguous, though the latter is very unlikely.
+ For example, say I have 5304 experience points in Boring Quest for the
+ Overused Plot Device. To split the number into two eight bit decimal numbers,
+ take 5304 %(modulus) 256. This will give a number that is the lower 8 bits.
+ Next, take 5304 / 256. The integral component of your answer will be the
+ upper 8 bits(or the next 8 bits, if the number is or can be larger than 16
+ bits) of 5304. Now you will need to search for these numbers. Fortunately,
+ most(all?) RPGs seem to store large numbers exactly as they are shown in the
+ game.
+</p>
+</body>
+</html>
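
Illustration for documentation/cheat.html (not part of the commit): the search filters and the multi-byte split from the Tips section can be sketched in a few lines of Python. This is an assumption-laden model, not FCE Ultra's code: the names `FILTERS`, `do_search`, and `split_low_high` are hypothetical, memory is modeled as plain dictionaries, and the automatic "Set Original to Current" step that follows the last two filters is left out. The sample data is the "Over Horizon" result list from the guide.

```python
# Each filter receives (original, current, v1, v2) and returns True when the
# address should stay in the list. Names match the labels in the guide.
FILTERS = {
    "O==V1 && C==V2": lambda o, c, v1, v2: o == v1 and c == v2,
    "O==V1 && |O-C|==V2": lambda o, c, v1, v2: o == v1 and abs(o - c) == v2,
    "|O-C|==V2": lambda o, c, v1, v2: abs(o - c) == v2,
    "O!=C": lambda o, c, v1, v2: o != c,
    "Value decreased.": lambda o, c, v1, v2: c < o,
    "Value increased.": lambda o, c, v1, v2: c > o,
}


def do_search(originals, current, filter_name, v1=0, v2=0):
    """Keep only the addresses whose values satisfy the chosen filter."""
    keep = FILTERS[filter_name]
    return {addr: orig for addr, orig in originals.items()
            if keep(orig, current[addr], v1, v2)}


def split_low_high(value):
    """Split a 16-bit number into (low byte, high byte), as in the Tips."""
    return value % 256, value // 256


# The 11 addresses that held the value 2 right after the search was reset.
originals = {addr: 2 for addr in (0x0000, 0x001C, 0x001E, 0x009D, 0x00B9,
                                  0x00E3, 0x0405, 0x0406, 0x0695, 0x07D5,
                                  0x07F8)}
# Values after losing one life, taken from the second result list.
current = dict(originals)
current.update({0x00B9: 41, 0x0405: 1, 0x07D5: 1})

survivors = do_search(originals, current, "|O-C|==V2", v2=1)
print([f"${addr:04x}" for addr in sorted(survivors)])  # ['$0405', '$07d5']
print(split_low_high(5304))  # (184, 20)
```

Searching for a change of exactly 1 leaves the same two candidates the guide points out, $0405 and $07d5; a second lost life would be needed to tell them apart.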
diff --git a/documentation/faq b/documentation/faq
new file mode 100644
index 00000000..69b2914e
--- /dev/null
+++ b/documentation/faq
@@ -0,0 +1,67 @@
+FCE Ultra General User's FAQ
+ preliminary version
+ Last updated on: Friday 13th, 2003
+------------------
+
+
+Q: Why do some games make a popping sound(Rad Racer 2, Final Fantasy 3)?
+A: These games do a very crude drum imitation by causing a large jump in
+ the output level for a short period of time via the register at $4011.
+ The analog filters on a real NES and Famicom make it sound rather decent.
+ I have not completely emulated these filters. Enabling high-quality
+ sound emulation will also make these pseudo-drums sound better. See
+ the next question for more information.
+
+Q: Why do some games' digitized sounds sound too loud?
+ Why do the drums in Crystalis and other games sound fuzzy?
+
+A: The NES' digital to analog converter is faulty, in that it does not output
+ sound linearly. This effect is most notable when a game messes with
+ register $4011, which is added with the triangle wave channel and the noise
+ channel outputs. When $4011 is set to a large value, the volume
+ of the triangle wave channel and the noise channel drops significantly. More
+ importantly, when digitized sounds are being played and the digitized sample
+ stream is at a high value, changes will be less noticeable. In other words,
+ the byte sequence "00 01 00" would be much more audible than the sequence
+ "7e 7f 7e". This non-linearity is only emulated when high-quality sound
+ emulation is enabled.
+
+Q: Why doesn't the NSF <insert name here> work(correctly) on FCE Ultra?
+A: Some NSF rips are bad. Some read from addresses that are not specified
+ in the NSF specifications, expecting certain values to be returned.
+ Others execute undocumented instructions that have no affect on
+ less-accurate software NSF players, but will cause problems on NSF players
+ that emulate these instructions. Also, the playback rate specified
+ in the NSF header is currently ignored, though I haven't encountered
+ any problems in doing this.
+
+Q: Why doesn't the game <insert name here> work(correctly) on FCE Ultra?
+A: Many factors can make a game not work on FCE Ultra:
+
+ - If the ROM image is in the iNES format(typically files that have
+ the extension "nes"), its header may be incorrect. This
+ incorrectness may also be because of garbage in the
+ header. Certain utilities used to put text in the reserved
+ bytes of the iNES header, then those reserved bytes were
+ later assigned functions. FCE Ultra recognizes and
+ automatically removes(from the ROM image in RAM, not on the
+ storage medium) SOME header junk.
+
+ If the game has graphical errors while scrolling, chances are
+ the mirroring is set incorrectly in the header.
+
+ You can try to edit the header with a utility(in the NES
+ utilities section at http://zophar.net ) or a hex editor.
+
+ - The on-cart hardware the game uses may not be emulated
+ correctly.
+
+ - Limitations of the ROM image format may prevent a game from
+ being emulated correctly without special code to recognize that
+ game. This occurs quite often with many Koei MMC5(iNES mapper 5)
+ and MMC1(iNES mapper 1) games in the iNES format. FCE Ultra identifies
+ and emulates some of these games based on the ROM CRC32 value.
+
+ - The ROM image may be encrypted. The author of SMYNES seems to
+ have done this intentionally to block other emulators from
+ playing "SMYNES only" games.
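
Illustration for documentation/faq (not part of the commit): the non-linearity described in the second answer can be checked numerically. The sketch below assumes the mixer approximation commonly cited in NES hardware references for the triangle, noise, and $4011 (DMC) outputs. That formula is not taken from this FAQ and may differ from what FCE Ultra actually implements, and the function name `tnd_out` is hypothetical.

```python
def tnd_out(triangle, noise, dmc):
    """Approximate mixed output for triangle (0-15), noise (0-15), DMC (0-127).

    Assumed formula from community hardware documentation, not from the FAQ.
    """
    total = triangle / 8227 + noise / 12241 + dmc / 22638
    if total == 0:
        return 0.0
    return 159.79 / (1 / total + 100)


# Output change for the sample sequences "00 01 00" and "7e 7f 7e".
low_step = tnd_out(0, 0, 0x01) - tnd_out(0, 0, 0x00)
high_step = tnd_out(0, 0, 0x7F) - tnd_out(0, 0, 0x7E)

# Contribution of a full-volume triangle with $4011 at 0 and at its maximum.
triangle_quiet_dmc = tnd_out(15, 0, 0) - tnd_out(0, 0, 0)
triangle_loud_dmc = tnd_out(15, 0, 127) - tnd_out(0, 0, 127)

print(f"step near 00: {low_step:.5f}, step near 7f: {high_step:.5f}")
print(f"triangle with DMC=0: {triangle_quiet_dmc:.5f}, "
      f"with DMC=127: {triangle_loud_dmc:.5f}")
```

Under this assumed formula the step near zero is more than twice the step near the top of the range, and the triangle's contribution shrinks when $4011 is large, which matches the behavior the answer describes.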
diff --git a/documentation/fceux-net-server.6 b/documentation/fceux-net-server.6
new file mode 100644
index 00000000..5cbb77ee
--- /dev/null
+++ b/documentation/fceux-net-server.6
@@ -0,0 +1,58 @@
+.\" (C) Copyright 2012 Joe Nahmias <jello@debian.org>
+.Dd November 19, 2014
+.Dt FCEUX-NET-SERVER 6
+.Os
+.Sh NAME
+.Nm fceux-net-server
+.Nd FCEUX game server
+.Sh SYNOPSIS
+.Nm fceux-net-server
+.Op Cm options
+.Sh DESCRIPTION
+This manual page documents briefly the
+.Nm
+command.
+.Pp
+.Nm
+is the game server for multiplayer use of the
+.Xr fceux 6
+family of NES emulators.
+This server will first look in
+.Pa /etc/fceux-server.conf
+for options.
+If that file does not exist, it will use the defaults given here.
+Any argument given directly will override any default values.
+.Sh OPTIONS
+The options are as follows:
+.Bl -tag -width Ds
+.It Fl h , Fl -help
+Displays a help message.
+.It Fl v , Fl -version
+Displays the version number and quits.
+.It Fl p Ar port , Fl -port Ar port
+Starts server on given port; the default is 4046.
+.It Fl w Ar password , Fl -password Ar password
+Specifies a password for entry.
+.It Fl m Ar num , Fl -maxclients Ar num
+Specifies the maximum number of clients allowed to access the server;
+the default is 100.
+.It Fl t Ar timeout , Fl -timeout Ar timeout
+Specifies the number of seconds before the server times out; the default is 5.
+.It Fl f Ar divisor , Fl -framedivisor Ar divisor
+Specifies frame divisor, which controls the number of updates sent to client;
+calculated as: 60 \(di framedivisor = updates per second.
+The default is 1.
+.It Fl c Ar file , Fl -configfile Ar file
+Loads the given configuration file.
+The default is
+.Pa /etc/fceux-server.conf .
+.El
+.Sh SEE ALSO
+.Xr fceux 6
+.Pp
+.Lk http://fceux.com/ "The FCEUX homepage" .
+.Sh AUTHORS
+.An -nosplit
+This manual page was written by
+.An Joe Nahmias Aq Mt jello@debian.org
+for the Debian GNU/Linux system (but may be used by others).
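
Illustration for documentation/fceux-net-server.6 (not part of the commit): the option handling and the frame divisor arithmetic described in the man page can be sketched as follows. The precedence order (built-in defaults, then the configuration file, then command-line arguments) is an assumption drawn from the DESCRIPTION section, and the names `DEFAULTS`, `resolve_options`, and `updates_per_second` are hypothetical. The configuration file's syntax is not documented here, so already-parsed dictionaries stand in for it.

```python
# Default values as listed in the OPTIONS section of the man page.
DEFAULTS = {"port": 4046, "maxclients": 100, "timeout": 5, "framedivisor": 1}


def resolve_options(config_file=None, command_line=None):
    """Merge option sources; later sources override earlier ones (assumed)."""
    options = dict(DEFAULTS)
    options.update(config_file or {})
    options.update(command_line or {})
    return options


def updates_per_second(framedivisor):
    """60 / framedivisor = updates per second, per the man page."""
    return 60 / framedivisor


options = resolve_options(config_file={"port": 5000, "framedivisor": 2},
                          command_line={"port": 6000})
print(options["port"], updates_per_second(options["framedivisor"]))
```

With a frame divisor of 2 the server would send 30 updates per second, and the command-line port wins over the one from the configuration file under the assumed precedence.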
diff --git a/documentation/fceux.6 b/documentation/fceux.6
new file mode 100644
index 00000000..53f494da
--- /dev/null
+++ b/documentation/fceux.6
@@ -0,0 +1,326 @@
+.Dd November 19, 2014
+.Dt FCEUX 6
+.Os
+.Sh NAME
+.Nm fceux
+.Nd emulator for the Nintendo Entertainment System and Famicom
+.Sh SYNOPSIS
+.Nm fceux
+.Op Ar options
+.Ar file
+.Sh DESCRIPTION
+.Nm
+is an emulator for the original (8\(hybit) Nintendo Entertainment System (NES).
+It has a robust color palette rendering engine that is fully customizable,
+along with excellent sound and joystick support, and even supports movie
+recording and playback.
+.Sh OPTIONS
+.Ss Misc. Options
+.Bl -tag -width Ds
+.It Fl -no-config Cm 0 | 1
+Use default config file and do not save to it, when enabled.
+.It Fl g Cm 0 | 1 , Fl -gamegenie Cm 0 | 1
+Enable or disable emulated Game Genie.
+.It Fl -nogui Cm 0 | 1
+Enable or disable the GTK graphical interface.
+.It Fl -loadlua Ar file
+Loads Lua script from filename
+.Ar file .
+.El
+.Ss Emulation Options
+.Bl -tag -width Ds
+.It Fl -pal Cm 0 | 1
+Enable or disable PAL mode.
+.El
+.Ss Input Options
+.Bl -tag -width Ds
+.It Fl i Ar dev , Fl -inputcfg Ar dev
+Configures input device
+.Ar dev
+on startup.
+Devices include:
+.Cm gamepad powerpad hypershot quizking
+.It Fl -input1 Ar dev
+.It Fl -input2 Ar dev
+Set which input device to emulate for input 1 or 2.
+Devices include:
+.Cm gamepad zapper powerpad.0 powerpad.1 arkanoid
+.It Fl -input3 Ar dev
+.It Fl -input4 Ar dev
+Set the Famicom expansion device to emulate for input 3 or 4.
+Devices include:
+.Cm quizking hypershot mahjong toprider ftrainer familykeyboard oekakids
+.Cm arkanoid shadow bworld 4player
+.It Fl -inputdisplay Cm 0 | 1
+Enable or disable input display.
+.It Fl -fourscore Cm 0 | 1
+Enable or disable Fourscore emulation.
+.El
+.Ss Graphics Options
+.Bl -tag -width Ds
+.It Fl -newppu Cm 0 | 1
+Enable or disable the new PPU core.
+.Pq Sy Warning : No May break savestates
+.It Fl -frameskip Ar frames
+Set number of frames to skip per emulated frame.
+.It Fl -clipsides Cm 0 | 1
+Enable or disable clipping of the leftmost and rightmost 8 columns of the video
+output.
+.It Fl -slstart Ar scanline
+Set the first scanline to be rendered.
+.It Fl -slend Ar scanline
+Set the last scanline to be rendered.
+.It Fl -nospritelim Cm 0 | 1
+When set to 0, this disables the normal 8 sprites per scanline limitation.
+When set to 1, this enables the normal 8 sprites per scanline limitation.
+.Sy Note : No Yes, this option is Sq backwards .
+.It Fl x Ar xres , Fl -xres Ar xres
+Set horizontal resolution for full screen mode.
+.It Fl y Ar yres , Fl -yres Ar yres
+Set vertical resolution for full screen mode.
+.It Fl -doublebuf Cm 0 | 1
+Enable or disable double buffering.
+.It Fl -autoscale Cm 0 | 1
+Enable or disable autoscaling in fullscreen.
+.It Fl -keepratio Cm 0 | 1
+Keep native NES aspect ratio when autoscaling.
+.It Fl -xscale Ar val
+.It Fl -yscale Ar val
+Multiply width/height by
+.Ar val .
+.Ar val
+can be a real number greater than 0 with OpenGL output;
+otherwise, it must be an integer greater than 0.
+.It Fl -xstretch Cm 0 | 1
+.It Fl -ystretch Cm 0 | 1
+Stretch to fill surface on x/y axis (OpenGL only).
+.It Fl b Cm 8 | 16 | 24 | 32 , Fl -bpp Cm 8 | 16 | 24 | 32
+Set bits per pixel.
+.It Fl -opengl Cm 0 | 1
+Enable or disable OpenGL support.
+.It Fl -openglip Cm 0 | 1
+Enable or disable OpenGL linear interpolation.
+.It Fl f Cm 0 | 1 , Fl -fullscreen Cm 0 | 1
+Enable or disable full\(hyscreen mode.
+.It Fl -noframe Cm 0 | 1
+Hide title bar and window decorations.
+.It Fl -special Ar filter
+Use special video scaling filters.
+.Ar filter
+is a number from 1\(en5:
+.Bl -tag -compact -width a
+.It 1
+hq2x
+.It 2
+Scale2x
+.It 3
+NTSC
+.It 4
+hq3x
+.It 5
+Scale3x
+.El
+.It Fl p Ar file , Fl -palette Ar file
+Use the custom palette in
+.Ar file .
+.It Fl -ntsccolor Cm 0 | 1
+Enable or disable NTSC NES colors.
+.It Fl -tint Ar val
+Set the NTSC tint.
+.It Fl -hue Ar val
+Set the NTSC hue.
+.El
+.Ss Sound Options
+.Bl -tag -width Ds
+.It Fl s Cm 0 | 1 , Fl -sound Cm 0 | 1
+Enable or disable sound.
+.It Fl -soundrate Ar rate
+Set the sound playback sample rate (0 = off?).
+.It Fl -soundq Cm 0 | 1 | 2
+Set sound quality.
+.Bl -tag -width a -compact
+.It 0
+Low
+.It 1
+High
+.It 2
+Very high
+.El
+.It Fl -soundbufsize Ar n
+Set sound buffer size to
+.Ar n
+milliseconds.
+.It Fl -volume Ar val
+Set sound volume to the given value,
+which can range from 0 to a maximum of 256.
+.It Fl -trianglevol Ar val
+Set sound volume of the triangle wave to the given value,
+which can range from 0 to a maximum of 256.
+.It Fl -square1vol Ar val
+Set sound volume of square wave 1 to the given value,
+which can range from 0 to a maximum of 256.
+.It Fl -square2vol Ar val
+Set sound volume of square wave 2 to the given value,
+which can range from 0 to a maximum of 256.
+.It Fl -noisevol Ar val
+Set sound volume of the noise generator to the given value,
+which can range from 0 to a maximum of 256.
+.It Fl -lowpass Cm 0 | 1
+Enable or disable lowpass filtering of the sound.
+.It Fl -soundrecord Ar file
+Record sound to
+.Ar file .
+.El
+.Ss Movie Options
+.Bl -tag -width Ds
+.It Fl -playmov Ar file
+Play back a recorded FCM/FM2 movie from
+.Ar file .
+.It Fl -pauseframe Ar frame
+Pause movie playback at frame
+.Ar frame .
+.It Fl -moviemsg Cm 0 | 1
+Enable or disable movie messages.
+.It Fl -fcmconvert Ar file
+Convert fcm movie file
+.Ar file
+to fm2.
+.It Fl -ripsubs Ar file
+Convert movie\(cqs subtitles to SubRip (srt) subtitles.
+.It Fl -subtitles Cm 0 | 1
+Enable or disable subtitle display.
+.El
+.Ss Networking Options
+.Bl -tag -width Ds
+.It Fl n Ar server , Fl -net Ar server
+Connect to
+.Ar server
+for TCP/IP network play.
+.It Fl -port Ar port
+Use TCP/IP port
+.Ar port
+for network play.
+.It Fl u Ar nick , Fl -user Ar nick
+Set the nickname to use in network play.
+.It Fl w Ar pass , Fl -pass Ar pass
+Set password to use for connecting to the server.
+.It Fl k Ar netkey , Fl -netkey Ar netkey
+Use the string
+.Ar netkey
+to create a unique session for the game loaded.
+.It Fl -players Ar num
+Set the number of local players.
+.It Fl -rp2mic Cm 0 | 1
+If enabled, replace Port 2 Start with microphone (Famicom).
+.It Fl -videolog Ar c
+Calls mencoder to grab the video and audio streams to encode them.
+Check the documentation for more on this.
+.It Fl -mute Cm 0 | 1
+Mutes
+.Nm
+while still passing the audio stream to mencoder.
+.El
+.Sh KEYBOARD COMMANDS
+.Nm
+has a number of commands available within the emulator.
+It also includes default keyboard bindings when emulating game pads
+or power pads.
+.Ss Gamepad Keyboard Bindings
+.TS
+center box;
+cb | cb, cb | c.
+NES Gamepad Keyboard
+=
+\(ua Keypad Up
+\(da Keypad Down
+\(<- Keypad Left
+\(-> Keypad Right
+A F
+B D
+Select S
+Start Enter
+.TE
+.Ss Other Commands
+.Bl -tag -width "Aq Alt+Enter"
+.It Aq Cm Alt Ns + Ns Cm Enter
+Toggle full\(hyscreen mode.
+.It Aq Cm F1
+Cheat menu (command\(hyline only).
+.It Aq Cm F2
+Toggle savestate binding to movies.
+.It Aq Cm F3
+Load Lua script.
+.It Aq Cm F4
+Toggle background rendering.
+.It Aq Cm F5
+Save game state into current slot (set using number keys).
+.It Aq Cm F7
+Restore game state from current slot (set using number keys).
+.It Aq Cm F10
+Toggle movie subtitles.
+.It Aq Cm F11
+Reset NES.
+.It Aq Cm F12
+Save screen snapshot.
+.It Aq Cm Shift Ns + Ns Cm F5
+Begin recording video.
+.It Aq Cm Shift Ns + Ns Cm F7
+Load recorded video.
+.It Cm 0 Ns \(en Ns Cm 9
+Select the numbered save state slot.
+.It Ao Cm Page Up Ac / Aq Cm Page Down
+Select next/previous state.
+.It Cm -
+Decrease emulation speed.
+.It Cm =
+Increase emulation speed.
+.It Aq Cm Tab
+Hold for turbo emulation speed.
+.It Aq Cm Pause
+Pause emulation.
+.It Cm \e
+Advance a single frame.
+.It Cm \&.
+Toggle movie frame counter.
+.It Cm \&,
+Toggle input display.
+.It Cm q
+Toggle movie read\(hyonly.
+.It Cm \(aq
+Advance a single frame.
+.It Cm /
+Lag counter display.
+.It Aq Cm Delete
+Frame advance lag skip display.
+.It Aq Cm Esc
+Quit
+.Nm .
+.El
+.Ss VS Unisystem Commands
+.Bl -tag -width "Aq F8"
+.It Aq Cm F8
+Insert coin.
+.It Aq Cm F6
+Show/hide dip switches.
+.It Cm 1 Ns \(en Ns Cm 8
+Toggle dip switches (when dip switches are shown).
+.El
+.Ss Famicom Disk System Commands
+.Bl -tag -width "Aq F6"
+.It Aq Cm F6
+Select disk and disk side.
+.It Aq Cm F8
+Eject or insert disk.
+.El
+.Sh SEE ALSO
+.Xr fceux-net-server 6
+.Pp
+.Lk http://fceux.com/ "The FCEUX homepage" .
+.Sh AUTHORS
+.An -nosplit
+This manual page was written by
+.An Joe Nahmias Aq Mt joe@nahmias.net ,
+.An Lukas Sabota Aq Mt ltsmooth42@gmail.com
+and
+.An Alexander Toresson Aq Mt alexander.toresson@gmail.com
+for the Debian GNU/Linux system (but may be used by others).
diff --git a/documentation/fcs.txt b/documentation/fcs.txt
new file mode 100644
index 00000000..dbaeec53
--- /dev/null
+++ b/documentation/fcs.txt
@@ -0,0 +1,153 @@
+FCE Ultra Save State Format
+ Updated: Mar 9, 2003
+---------------------------------------
+
+FCE Ultra's save state format is now designed to be as forward and backwards
+compatible as possible. This is achieved through the (over)use of chunks.
+All multiple-byte variables are stored LSB(least significant byte)-first.
+Data types:
+
+ (u)int8 - (un)signed 8 bit variable(also referred to as "byte")
+ (u)int16 - (un)signed 16 bit variable
+ (u)int32 - (un)signed 32 bit variable
+
+-- Main File Header:
+
+The main file header is 16 bytes in length. The first three bytes contain
+the string "FCS". The next byte contains the version of FCE Ultra that saved
+this save state. This document only applies to version "53"(.53) and higher.
+After the version byte, the size of the entire file in bytes(minus the 16 byte
+main file header) is stored. The rest of the header is currently unused
+and should be nulled out. Example of relevant parts:
+
+ FCS <uint8 version> <uint32 totalsize>
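The header layout above can be read with a single fixed-size unpack. This is a minimal sketch, not FCE Ultra's own loader: the function name and the sample bytes are made up for illustration, and it assumes the eight unused bytes directly follow the size field.

```python
import struct

# Layout per the text: "FCS", uint8 version, uint32 size (LSB first), 8 unused bytes.
FCS_HEADER = struct.Struct("<3sBI8x")  # 3 + 1 + 4 + 8 = 16 bytes


def parse_fcs_header(blob):
    """Return (version, total_size) from the first 16 bytes of a save state."""
    if len(blob) < FCS_HEADER.size:
        raise ValueError("file shorter than the 16-byte main header")
    magic, version, total_size = FCS_HEADER.unpack_from(blob, 0)
    if magic != b"FCS":
        raise ValueError("missing FCS signature")
    return version, total_size


# Hypothetical header: version 53, 1000 bytes of chunk data after the header.
sample_header = b"FCS" + bytes([53]) + struct.pack("<I", 1000) + bytes(8)
version, total_size = parse_fcs_header(sample_header)
```

A reader could compare total_size against the actual file length minus 16 as a quick sanity check before walking the chunks.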
+
+-- Section Chunks:
+
+Section chunk headers are 5 bytes in length. The first byte defines what
+section it is, the next four bytes define the total size of the section
+(including the section chunk header).
+
+ <uint8 section> <uint32 size>
+
+Section definitions:
+
+ 1 - "CPU"
+ 2 - "CPUC"
+ 3 - "PPU"
+ 4 - "CTLR"
+ 5 - "SND"
+ 16 - "EXTRA"
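Walking the section chunks might look like the sketch below. It takes the text at its word that the stored size includes the 5-byte section header; iter_sections and SECTION_NAMES are illustrative names, not part of FCE Ultra.

```python
import struct

# Section identifiers as listed above.
SECTION_NAMES = {1: "CPU", 2: "CPUC", 3: "PPU", 4: "CTLR", 5: "SND", 16: "EXTRA"}


def iter_sections(body):
    """Yield (section_id, payload) for each section chunk after the main header."""
    offset = 0
    while offset < len(body):
        section_id, size = struct.unpack_from("<BI", body, offset)
        # The size counts the 5-byte section header itself, so it can never be < 5.
        if size < 5 or offset + size > len(body):
            raise ValueError("bad section size at offset %d" % offset)
        yield section_id, body[offset + 5:offset + size]
        offset += size
```

Unknown section numbers are passed through unchanged, so a caller can skip them, which matches the forward-compatibility goal stated at the top of the document.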
+
+-- Subsection Chunks
+
+Subsection chunks are stored within section chunks. They contain the actual
+state data. Each subsection chunk is composed of an 8-byte header and the data.
+The header contains a description(a name) and the size of the data contained
+in the chunk:
+ <uint8 description[4]> <uint32 size>
+
+The name is a four-byte string. It does not need to be null-terminated.
+If the string is less than four bytes in length, the remaining unused bytes
+must be null.
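Subsection chunks can be walked the same way. The sketch below assumes the size field counts only the data and not the 8-byte header, which is how the text reads; the function name and the demo payload are hypothetical.

```python
import struct


def iter_subsections(payload):
    """Yield (name, data) for each subsection chunk inside one section payload."""
    offset = 0
    while offset < len(payload):
        raw_name, size = struct.unpack_from("<4sI", payload, offset)
        start = offset + 8
        if start + size > len(payload):
            raise ValueError("subsection runs past the end of its section")
        # Names shorter than four bytes are padded with nulls; strip the padding.
        yield raw_name.rstrip(b"\0").decode("ascii"), payload[start:start + size]
        offset = start + size


# Hypothetical payload: a 2-byte "PC" chunk followed by a 1-byte "A" chunk.
demo = (b"PC\0\0" + struct.pack("<I", 2) + struct.pack("<H", 0xC000)
        + b"A\0\0\0" + struct.pack("<I", 1) + b"\x7f")
chunks = dict(iter_subsections(demo))
```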
+
+-- Subsection Chunk Description Definitions
+
+Note that not all subsection chunk description definitions listed below
+are guaranteed to be in the section chunk. It's just a list of what CAN
+be in a section chunk. This especially applies to the "EXTRA" section.
+
+---- Section "CPU"
+
+ Name: Type: Description:
+
+ PC uint16 Program Counter
+ A uint8 Accumulator
+ P uint8 Processor status register
+ X uint8 X register
+ Y uint8 Y register
+ S uint8 Stack pointer
+ RAM uint8[0x800] 2KB work RAM
+
+---- Section "CPUC" (emulator specific)
+
+ Name: Type: Description:
+
+ JAMM uint8 Non-zero value if CPU in a "jammed" state
+ IRQL uint8 Non-zero value if IRQs are to be generated constantly
+ ICoa int32 Temporary cycle counter
+ ICou int32 Cycle counter
+
+---- Section "PPU"
+
+ Name: Type: Description:
+
+ NTAR uint8[0x800] 2 KB of name/attribute table RAM
+ PRAM uint8[32] 32 bytes of palette index RAM
+ SPRA uint8[0x100] 256 bytes of sprite RAM
+ PPU uint8[4] Last values written to $2000 and $2001, the PPU
+ status register, and the last value written to
+ $2003.
+ XOFF uint8 Tile X-offset.
+ VTOG uint8 Toggle used by $2005 and $2006.
+ RADD uint16 PPU Address Register(address written to/read from
+ when $2007 is accessed).
+ TADD uint16 PPU Address Register
+ VBUF uint8 VRAM Read Buffer
+ PGEN uint8 PPU "general" latch. See Ki's document.
+
+---- Section "CTLR" (somewhat emulator specific)
+
+ Name: Type: Description:
+
+ J1RB uint8 Bit to be returned when first joystick is read.
+ J2RB uint8 Bit to be returned when second joystick is read.
+
+---- Section "SND" (somewhat emulator specific)
+
+ NREG uint16 Noise LFSR.
+ P17 uint8 Last byte written to $4017.
+ PBIN uint8 DMC bit index.
+ PAIN uint32 DMC address index(from $8000).
+ PSIN uint32 DMC length counter(how many bytes left
+ to fetch).
+
+ <to be finished>
+
+---- Section "EXTRA" (varying emulator specificness)
+
+ For iNES-format games(incomplete, and doesn't apply to every game):
+
+ Name: Type: Description:
+
+ WRAM uint8[0x2000] 8KB of WRAM at $6000-$7fff
+ MEXR uint8[0x8000] (very emulator specific)
+ CHRR uint8[0x2000] 8KB of CHR RAM at $0000-$1fff(in PPU address space).
+ EXNR uint8[0x800] Extra 2KB of name/attribute table RAM.
+ MPBY uint8[32] (very emulator specific)
+ MIRR uint8 Current mirroring:
+ 0 = "Horizontal"
+ 1 = "Vertical"
+ $10 = Mirror from $2000
+ $11 = Mirror from $2400
+ IRQC uint32 Generic IRQ counter
+ IQL1 uint32 Generic IRQ latch
+ IQL2 uint32 Generic IRQ latch
+ IRQA uint8 Generic IRQ on/off register.
+ PBL uint8[4] List of 4 8KB ROM banks paged in at $8000-$FFFF
+ CBL uint8[8] List of 8 1KB VROM banks paged in at $0000-$1FFF(PPU).
+
+ For FDS games(incomplete):
+
+ Name: Type: Description:
+
+ DDT<x> uint8[65500] Disk data for side x(0-3).
+ FDSR uint8[0x8000] 32 KB of work RAM
+ CHRR uint8[0x2000] 8 KB of CHR RAM
+ IRQC uint32 IRQ counter
+ IQL1 uint32 IRQ latch
+ IRQA uint8 IRQ on/off.
+
+ WAVE uint8[64] Carrier waveform data.
+ MWAV uint8[32] Modulator waveform data.
+ AMPL uint8[2] Amplitude data.
diff --git a/documentation/fm2.txt b/documentation/fm2.txt
new file mode 100644
index 00000000..37ec4969
--- /dev/null
+++ b/documentation/fm2.txt
@@ -0,0 +1,79 @@
+FM2 is ASCII plain text.
+It consists of several key-value pairs followed by an inputlog section.
+The inputlog section can be identified by its first line starting with a | (pipe).
+The inputlog section terminates at EOF.
+Newlines may be \r\n or \n.
+
+Key-value pairs consist of a key identifier, followed by a space separator, followed by the value text.
+Value text is always terminated by a newline, which the value text will not include.
+The value text is parsed differently depending on the type of the key.
+The key-value pairs may be in any order, except that the first key must be version.
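A reader for this layout could be sketched as follows. It is an illustration under the rules stated above rather than the emulator's parser: parse_fm2_header is a made-up name, the demo values are invented, and pairs are kept as a list because keys such as comment may repeat.

```python
def parse_fm2_header(text):
    """Split FM2 text into (list of (key, value) pairs, list of inputlog lines)."""
    pairs, inputlog = [], []
    in_log = False
    for line in text.splitlines():  # accepts both \r\n and \n line endings
        if not in_log and line.startswith("|"):
            in_log = True  # the first pipe-led line starts the inputlog
        if in_log:
            if line:
                inputlog.append(line)
        elif line:
            # Key, one space separator, then the rest of the line as the value.
            key, _, value = line.partition(" ")
            pairs.append((key, value))
    if not pairs or pairs[0][0] != "version":
        raise ValueError("the first key must be 'version'")
    return pairs, inputlog


# Hypothetical movie text with mixed line endings.
demo = "version 3\r\nemuVersion 9828\ncomment author someone\n|0|........|........||\n"
pairs, frames = parse_fm2_header(demo)
```

Values are left as strings here; converting them to integers, hex blobs, or GUIDs depends on the key type, as the following paragraphs of the format description explain.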
+
+Integer keys (also used for booleans, with a 1 or 0) will have a value that is a simple integer not to exceed 32 bits.
+ - version (required) - the version of the movie file format; for now it is always 3
+ - emuVersion (required) - the version of the emulator used to produce the movie
+ - rerecordCount (optional) - the rerecord count
+ - palFlag (bool) (optional) - true if the movie uses pal timing
+ - fourscore (bool) (*note C) - true if a fourscore was used
+ - port0, port1 (*note C) - indicates the types of input devices. Supported values are:
+ SI_GAMEPAD = 1,
+ SI_ZAPPER = 2
+ - port2 (required) - indicates the type of the FCExp port device which was attached. Supported values are:
+ SIFC_NONE = 0
+
+String keys have values that consist of the remainder of the key-value pair line. As a consequence, string values cannot contain newlines.
+ - romFilename (required) - the name of the file used to record the movie
+ - comment (optional) - simply a memo.
+ by convention, the first token in the comment value is the subject of the comment.
+ by convention, subsequent comments with the same subject will have their ordering preserved and may be used to approximate multiline comments.
+ by convention, the author of the movie should be stored in comment(s) with a subject of: author
+
+Hex string keys (used for binary blobs) will have a value that is like 0x0123456789ABCDEF...
+ - romChecksum (required) - the MD5 hash of the rom which was used to record the movie
+ - savestate (optional) - a fcs savestate blob, in case a movie was recorded from savestate
+
+GUID keys have a value which is in the standard guid format: 452DE2C3-EF43-2FA9-77AC-0677FC51543B
+ - guid (required) a unique identifier for a movie, generated when the movie is created, which is used when loading a savestate to make sure it belongs to the current movie.
+
+The inputlog section consists of lines beginning and ending with a | (pipe).
+The fields are as follows, except as noted in note C.
+|c|port0|port1|port2|
+
+field c is a variable length decimal integer which is a bitfield corresponding to miscellaneous input states which are valid at the start of the frame.
+Current values for this are
+MOVIECMD_RESET = 1
+
+the format of port0, port1, port2 depends on which types of devices were attached.
+SI_GAMEPAD:
+ the field consists of eight characters which constitute a bitfield.
+ any character other than ' ' or '.' means that the button was pressed.
+ by convention, the following mnemonics will be used in a column to remind us of which button corresponds to which column:
+ RLDUTSBA (Right,Left,Down,Up,sTart,Select,B,A)
+ This seemingly arbitrary ordering is actually the reverse of the originally-desired order, which was screwed up in the first release of FCEUX. So we have preserved it for compatibility's sake.
+SI_ZAPPER:
+ XXX YYY B Q Z
+ XXX: %03d, the x position of the mouse
+ YYY: %03d, the y position of the mouse
+ B: %1d, 1 if the mouse button is pressed; 0 if not
+ Q: %1d, an internal value used by the emulator's zapper code (this is most unfortunate..)
+ Z: %d, a variable-length decimal integer; an internal value used by the emulator's zapper code (this is even more unfortunate..)
+SIFC_NONE:
+ this field must always be empty.
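As an illustration of the SI_GAMEPAD rule above, the sketch below maps an eight-character field to button names using the RLDUTSBA column order. The names BUTTONS and decode_gamepad are hypothetical.

```python
# Column order from the RLDUTSBA mnemonic.
BUTTONS = ("Right", "Left", "Down", "Up", "Start", "Select", "B", "A")


def decode_gamepad(field):
    """Return the set of pressed buttons for one 8-character gamepad field."""
    if len(field) != 8:
        raise ValueError("gamepad field must be exactly 8 characters")
    # Any character other than ' ' or '.' counts as pressed.
    return {name for name, ch in zip(BUTTONS, field) if ch not in " ."}
```

Because only the column position matters, a field such as "xxxxxxxx" decodes as all eight buttons held, even though it does not use the mnemonic letters.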
+
+* Notes *
+A. There is no key-value pair that indicates the length of the movie. This must be read by scanning the inputlog and counting the number of lines.
+
+B. All movies start from power-on, unless a savestate key-value is present.
+
+C.
+If a fourscore is used, then port0 and port1 are irrelevant and ignored.
+The input types must all be gamepads, and the inputlog will be in the following format:
+ {player1 player2 player3 player4}
+|c|RLDUTSBA|RLDUTSBA|RLDUTSBA|RLDUTSBA|port2|
+If a fourscore is not used, then port0 and port1 are required.
+
+D. The emulator uses these framerate constants
+ - NTSC: 1008307711 /256/65536 = 60.099822938442230224609375
+ - PAL : 838977920 /256/65536 = 50.00698089599609375
+
+E. The author of this format is curious about what people think of it. Please let him know!
\ No newline at end of file
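Notes A and D of fm2.txt together suggest how a tool could estimate a movie's running time: count the inputlog lines and divide by the frame rate constant. The sketch below assumes one inputlog line per frame; movie_seconds is a made-up helper.

```python
# Frame rate constants quoted in note D.
NTSC_FPS = 1008307711 / 256 / 65536   # about 60.0998
PAL_FPS = 838977920 / 256 / 65536     # about 50.0070


def movie_seconds(inputlog_lines, pal=False):
    """Estimate playback time, assuming one inputlog line per frame."""
    return len(inputlog_lines) / (PAL_FPS if pal else NTSC_FPS)
```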
diff --git a/documentation/porting.txt b/documentation/porting.txt
new file mode 100644
index 00000000..63d2b4fc
--- /dev/null
+++ b/documentation/porting.txt
@@ -0,0 +1,289 @@
+FCE Ultra Porting Guide
+ Updated: October 4, 2003
+
+*Incomplete*
+
+
+***Driver-supplied functions:
+ These functions will only be called after the driver code calls
+ FCEUI_LoadGame() or FCEUI_Emulate().
+
+void FCEUD_Update(uint8 *XBuf, int32 *Buffer, int Count);
+ Called by FCE Ultra on every emulated frame. This function should
+ perform the following three things(in any order):
+
+ 1.
+ Update the data pointed to by the pointers passed to
+ FCEUI_SetInput() and FCEUI_SetInputFC().
+ 2.
+ Copy contents of XBuf over to video memory(or whatever needs to be
+ done to make the contents of XBuf visible on screen).
+ Each line is 256 pixels(and bytes) in width, and there can be 240
+ lines. The pitch for each line is 272 bytes.
+ XBuf will be 0 if the symbol FRAMESKIP is defined and this frame
+ was skipped.
+ 3.
+ Write the contents of "Buffer" to the sound device. "Count" is the
+ number of samples to write. Only the lower 16-bits of each
+ sample are used, so each 32-bit sample in "Buffer" can be converted to
+ signed 16-bit by dropping the upper 16 bits.
+ When sound was disabled for the frame, "Count" will be 0.
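Steps 2 and 3 of FCEUD_Update() come down to two small transformations, modeled here in Python rather than C so they can be run on their own. The sketch assumes XBuf points at the first visible pixel and that the 16 extra bytes of pitch trail each line, which the text does not state; all names are illustrative.

```python
XBUF_WIDTH, XBUF_PITCH, XBUF_LINES = 256, 272, 240


def visible_rows(xbuf, first_line=0, last_line=XBUF_LINES - 1):
    """Yield the 256 visible bytes of each line, skipping the rest of the pitch."""
    for y in range(first_line, last_line + 1):
        start = y * XBUF_PITCH
        yield xbuf[start:start + XBUF_WIDTH]


def to_int16(sample32):
    """Keep the low 16 bits of a 32-bit sample and reinterpret them as signed."""
    low = sample32 & 0xFFFF
    return low - 0x10000 if low & 0x8000 else low
```

In a C driver the same effect comes from advancing the source pointer by 272 per line while copying 256 bytes, and from casting each sample to a signed 16-bit type.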
+
+void FCEUD_SetPalette(uint8 index, uint8 r, uint8 g, uint8 b);
+ Set palette entry "index" to specified RGB values(value: min=0, max=255).
+
+void FCEUD_GetPalette(uint8 index, uint8 *r, uint8 *g, uint8 *b);
+ Get palette entry "index" data.
+
+void FCEUD_PrintError(char *s);
+ Print/Display an error message string pointed to by "s".
+
+void FCEUD_Message(char *s);
+ Display a status message string.
+
+
+int FCEUD_NetworkConnect(void);
+ Initialize a network connection. Return 0 if an error occurs.
+
+int FCEUD_GetDataFromClients(uint8 *data);
+
+/* Sends 5 bytes of data to all clients. */
+int FCEUD_SendDataToClients(uint8 *data);
+
+/* Sends 1 byte of data to server, and maybe a command. */
+int FCEUD_SendDataToServer(uint8 v, uint8 cmd);
+
+/* Gets 5 bytes of data from the server. This function must block. */
+int FCEUD_GetDataFromServer(uint8 *data);
+
+void FCEUD_NetworkClose(void);
+ Close the network connection.
+
+
+***FCE Ultra functions(called by the driver code):
+ The FCEUI_* functions may only be called before FCEUI_Emulate() is
+ called or after it returns and before it is called again, or after the
+ following functions are called and before they return:
+ FCEUD_Update();
+ Calling the FCEUI_* functions at any other time may result in
+ undefined behavior.
+
+void FCEUI_SetInput(int port, int type, void *ptr, int attrib);
+ "port" can be either 0 or 1, and corresponds to the physical
+ ports on the front of a NES.
+
+ "type" may be:
+ SI_NONE - No input on this port.
+ SI_GAMEPAD - Standard NES gamepad
+ SI_ZAPPER - "Zapper" light gun.
+ SI_POWERPAD - Power-pad mat.
+ SI_ARKANOID - Arkanoid controller.
+
+void FCEUI_SetInputFC(int type, void *ptr, int attrib);
+ Special Famicom devices.
+ "type" may be:
+ SIFC_NONE - No input here.
+ SIFC_ARKANOID - Arkanoid controller.
+ SIFC_SHADOW - "Space Shadow" gun.
+ SIFC_4PLAYER - Famicom 4-player adapter
+ SIFC_FKB - Family Keyboard
+
+void FCEUI_DisableFourScore(int s);
+ Disables four-score emulation if s is nonzero.
+
+void FCEUI_SetSnapName(int a);
+ 0 to order screen snapshots numerically(0.png), 1 to order them file
+ base-numerically(smb3-0.png).
+
+void FCEUI_DisableSpriteLimitation(int a);
+ Disables the 8-sprite-per-scanline limitation of the NES if "a"
+ is nonzero. By default, the limitation is enabled.
+
+void FCEUI_SaveExtraDataUnderBase(int a);
+ If "a" is nonzero, save extra non-volatile game data(battery-backed
+ RAM) under FCE Ultra's base directory. Otherwise, the behavior is
+ to save it under the same directory the game is located in(this is
+ the default behavior).
+
+FCEUGI *FCEUI_LoadGame(char *name);
+ Loads a new file. "name" is the full path of the file to load.
+ Returns 0 on failure, or a pointer to data type "FCEUGI":
+ See file "git.h" for more details on this structure.
+
+int FCEUI_Initialize(void);
+ Allocates and initializes memory. Should only be called once, before
+ any calls to other FCEU functions.
+
+void FCEUI_SetBaseDirectory(void);
+ Specifies the base FCE Ultra directory. This should be called
+ immediately after FCEUI_Initialize() and any time afterwards.
+
+void FCEUI_SetDirOverride(int which, char *n);
+
+ FCEUIOD_CHEATS - Cheats
+ FCEUIOD_MISC - Miscellaneous stuff(custom game palettes)
+ FCEUIOD_NV - Non-volatile game data(battery backed RAM)
+ FCEUIOD_SNAPS - Screen snapshots
+ FCEUIOD_STATE - Save states
+
+void FCEUI_Emulate(void);
+ Enters the emulation loop. This loop will be exited when FCEUI_CloseGame()
+ is called. This function obviously shouldn't be called if FCEUI_LoadGame()
+ wasn't called or FCEUI_CloseGame() was called after FCEUI_LoadGame().
+
+void FCEUI_CloseGame(void);
+ Closes the loaded game and frees all memory used to load it.
+ Also causes FCEUI_Emulate() to return.
+
+void FCEUI_ResetNES(void);
+void FCEUI_PowerNES(void);
+
+void FCEUI_SetRenderedLines(int ntscf, int ntscl, int palf, int pall);
+ Sets the first(minimum is 0) and last(NOT the last scanline plus one;
+ maximum is 239) scanlines of background data to draw, for both NTSC
+ emulation mode and PAL emulation mode.
+
+ Defaults are as if this function were called with the variables set
+ up as follows:
+ ntscf=8, ntscl=231, palf=0, pall=239
+
+void FCEUI_SetNetworkPlay(int type);
+ Sets status of network play according to "type". If type is 0,
+ then network play is disabled. If type is 1, then we are server.
+ If type is 2, then we are a client.
+
+void FCEUI_SelectState(int w);
+ Selects the state "slot" to save to and load from.
+
+void FCEUI_SaveState(void);
+ Saves the current virtual NES state to the "slot" selected by
+ FCEUI_SelectState().
+
+void FCEUI_LoadState(void);
+ Loads the current virtual NES state from the "slot" selected by
+ FCEUI_SelectState().
+
+void FCEUI_SaveSnapshot(void);
+ Saves a screen snapshot.
+
+void FCEUI_DispMessage(char *msg);
+ Displays a short, one-line message using FCE Ultra's built-in
+ functions and ASCII font data.
+
+int32 FCEUI_GetDesiredFPS(void);
+ Returns the desired FPS based on whether NTSC or PAL emulation is
+ enabled, shifted left by 24 bits(this is necessary because the real
+ FPS value is not a whole integer). This function should only be
+ necessary if sound emulation is disabled.
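To make the fixed-point return value concrete, the sketch below converts it back to frames per second and to a per-frame delay. It only models the arithmetic; the helper names are hypothetical and the input in the usage note is an invented round value, not what FCE Ultra actually returns.

```python
def fps_from_fixed(desired_fps):
    """Convert a value with 24 fractional bits to frames per second."""
    return desired_fps / (1 << 24)


def frame_microseconds(desired_fps):
    """Length of one frame in microseconds, for pacing when sound is off."""
    return 1_000_000 * (1 << 24) / desired_fps
```

For example, a return value of 60 << 24 would correspond to exactly 60 frames per second, or about 16667 microseconds per frame.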
+
+int FCEUI_GetCurrentVidSystem(int *slstart, int *slend);
+ Convenience function(not strictly necessary, but reduces excessive code
+ duplication); returns currently emulated video system
+ (0=NTSC, 1=PAL). It will also set the variables pointed to by slstart
+ and slend to the first and last scanlines to be rendered, respectively,
+ if slstart and slend are not 0.
+
+void FCEUI_GetNTSCTH(int *tint, int *hue);
+void FCEUI_SetNTSCTH(int n, int tint, int hue);
+
+int FCEUI_AddCheat(char *name, uint16 addr, uint8 val);
+ Adds a RAM cheat with the specified name to patch the address "addr"
+ with the value "val".
+
+int FCEUI_DelCheat(uint32 which);
+ Deletes the specified(by number) cheat.
+
+void FCEUI_ListCheats(void (*callb)(char *name, uint16 a, uint8 v));
+ Causes FCE Ultra to go through the list of all cheats loaded for
+ the current game and call callb() for each cheat with the cheat
+ information.
+
+int FCEUI_GetCheat(uint32 which, char **name, int32 *a, int32 *v, int *s);
+ Gets information on the cheat referenced by "which".
+
+int FCEUI_SetCheat(uint32 which, char *name, int32 a, int32 v, int s);
+ Sets information for the cheat referenced by "which".
+
+void FCEUI_CheatSearchBegin(void);
+ Begins the cheat search process. Current RAM values are copied
+ to a buffer to later be processed by the other cheat search functions.
+
+void FCEUI_CheatSearchEnd(int type, int v1, int v2);
+ Searches the buffer using the search method specified by "type"
+ and the parameters "v1" and "v2".
+
+int32 FCEUI_CheatSearchGetCount(void);
+ Returns the number of matches from the cheat search.
+
+void FCEUI_CheatSearchGet(void (*callb)(uint16 a, int last, int current));
+
+void FCEUI_CheatSearchGetRange(int first, int last, void (*callb)(uint16 a, int last, int current));
+ Like FCEUI_CheatSearchGet(), but you can specify the first and last
+ matches to get.
+
+void FCEUI_CheatSearchShowExcluded(void);
+ Undoes any exclusions of valid addresses done by FCEUI_CheatSearchEnd().
+
+void FCEUI_CheatSearchSetCurrentAsOriginal(void);
+ Copies the current values in RAM into the cheat search buffer.
+
+void FCEUI_MemDump(uint16 a, int32 len, void (*callb)(uint16 a, uint8 v));
+ Callback to dump memory.
+
+void FCEUI_DumpMem(char *fname, uint32 start, uint32 end);
+ Dump memory to filename fname.
+
+void FCEUI_MemPoke(uint16 a, uint8 v, int hl);
+ Write a byte to specified address. Set hl to 1 to attempt to store
+ it to ROM("high-level" write).
+
+void FCEUI_NMI(void);
+ Triggers(queues) an NMI.
+
+void FCEUI_IRQ(void);
+ Triggers(queues) an IRQ.
+
+void FCEUI_Disassemble(uint16 a, int (*callb)(uint16 a, char *s));
+ Text disassembly.
+
+void FCEUI_GetIVectors(uint16 *reset, uint16 *irq, uint16 *nmi);
+ Get current interrupt vectors.
+
+
+***Recognized defined symbols:
+
+The following defined symbols affect the way FCE Ultra is compiled:
+
+ C80x86
+ - Include 80x86 inline assembly in AT&T syntax, if available. Also
+ use special 80x86-specific C constructs if the compiler is compatible.
+
+ FRAMESKIP
+ - Include frame skipping code.
+
+ NETWORK
+ - Include network play code.
+
+ FPS
+ - Compile code that prints out a number when FCE Ultra exits
+ that represents the average fps.
+
+ ZLIB
+ - Compile support for compressed PKZIP-style files AND gzip compressed
+ files. "unzip.c" will need to be compiled and linked in by you if
+ this is defined(it's in the zlib subdirectory).
+
+ LSB_FIRST
+ - Compile code to expect that variables that are greater than 8 bits
+ in size are stored Least Significant Byte First in memory.
+
+ PSS_STYLE x
+ - Sets the path separator style to the integer 'x'. Valid styles are:
+ 1: Only "/" - For UNIX platforms.
+ 2: Both "/" and "\" - For Windows and MSDOS platforms.
+ 3: Only "\" - For ???.
+ 4: Only ":" - For Apple IIs ^_^.
+
+
+
+
diff --git a/documentation/protocol.txt b/documentation/protocol.txt
new file mode 100644
index 00000000..0122e498
--- /dev/null
+++ b/documentation/protocol.txt
@@ -0,0 +1,90 @@
+FCE Ultra 0.91+ network play protocol.
+Description v 0.0.1
+--------------------------------------
+
+In FCE Ultra, all data is sent to the server, and on every emulated frame
+(60 Hz on NTSC) the server distributes the collated data to the
+clients.
+
+The server should not block when it is receiving UDP data from the clients.
+If no UDP data is available, then just go on.
+
+The clients MUST block until the input data packet comes on every emulated
+frame.
+
+Packets from the server to the client are sent out over both TCP and UDP.
+Duplicate packets should be discarded. Out-of-order packets can either
+be cached, or discarded(what I recommend, as caching code gets a little
+complex and wouldn't yield any benefit from what I've observed).
+In the case of client->server UDP communications, the server should just use
+the data from the UDP packet that has the highest packet number count, and
+the server should then set its internal incoming packet counter to that
+number(to prevent out-of-order packets from totally screwing up user input).
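The server-side rule for client->server UDP packets (use the highest packet number, ignore anything older) can be sketched as a small tracker. The class name is made up, and the sketch ignores 32-bit counter wraparound, which the text does not address.

```python
class IncomingCounter:
    """Tracks the highest client->server packet number seen so far."""

    def __init__(self):
        self.last = -1  # nothing received yet

    def accept(self, counter):
        """Return True if the packet is newer than anything seen so far."""
        if counter <= self.last:
            return False  # duplicate or out-of-order packet: ignore it
        self.last = counter
        return True
```

Skipped numbers are accepted without complaint, so a lost packet simply means the server keeps using the previous input until a newer packet arrives.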
+
+The "magic number"(used with UDP packets) is meant to reduce the chance of a
+hostile remote host disrupting the network play without resorting to extreme
+amounts of network bandwidth. The server generates the magic number, and it is
+best if the magic number is as random as possible.
+UDP packets received with an incorrect magic number should be discarded, of
+course.
+
+Initialization, server->client:
+
+uint32 Local UDP port(what the server is listening on).
+uint32 Player number(0-3) for client.
+uint32 Magic number(for UDP).
+
+
+Initialization, client->server
+
+uint32 Local UDP port(that the client is listening on).
+
+
+Structure of UDP packet data:
+
+uint32 CRC32 - Includes magic number, packet counter, and
+ data. For reference, CRC32 is calculated
+ with the zlib function crc32().
+uint32 Magic number
+uint32 Packet counter(linear, starts at 0).
+uint8[variable] Data.
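The UDP packet layout above can be sketched with Python's struct and zlib modules. The byte order is an assumption: this document does not state it, and little-endian is used here only for illustration. The function names are hypothetical.

```python
import struct
import zlib


def build_udp_packet(magic, counter, data):
    """Prefix data with the CRC32, magic number, and packet counter."""
    body = struct.pack("<II", magic, counter) + data
    # The CRC covers the magic number, the packet counter, and the data.
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) + body


def parse_udp_packet(packet, expected_magic):
    """Return (counter, data), or None if the packet should be discarded."""
    if len(packet) < 12:
        return None
    crc, magic, counter = struct.unpack_from("<III", packet, 0)
    if magic != expected_magic:
        return None
    if (zlib.crc32(packet[4:]) & 0xFFFFFFFF) != crc:
        return None
    return counter, packet[12:]
```

Python's zlib.crc32() wraps the same zlib routine the text refers to, so the checksum values should agree with a C implementation that feeds it the same bytes.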
+
+Structure of tcp packet data:
+
+uint32 Packet counter(" ").
+uint8[variable] Data.
+
+
+
+Data format of server->client communications:
+
+ uint8[4] Controller data
+ uint8 Command byte. 0 if no command. Otherwise(in decimal):
+
+ 1 Select FDS disk side.
+ 2 Insert/eject FDS disk.
+ 10 Toggle VS Unisystem dip switch editing.
+ 11 ... 18 Toggle VS Unisystem dip switches.
+ 19 Insert VS Unisystem coin.
+ 30 Reset NES.
+ 31 Power toggle NES.
+ 40 Save state(not implemented correctly).
+ 41 Load state(not implemented correctly).
+ 42 ... 50 Select save state slot(minus 42).
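A client could turn the command byte into an action with a lookup like the one below. The descriptions come from the list above; numbering the dip switches from 1 (command minus 10) is an assumption, and describe_command is an illustrative name.

```python
def describe_command(cmd):
    """Map a server->client command byte to a short description."""
    fixed = {
        0: "no command",
        1: "select FDS disk side",
        2: "insert/eject FDS disk",
        10: "toggle VS Unisystem dip switch editing",
        19: "insert VS Unisystem coin",
        30: "reset NES",
        31: "power toggle NES",
        40: "save state",
        41: "load state",
    }
    if cmd in fixed:
        return fixed[cmd]
    if 11 <= cmd <= 18:
        # Assumed numbering: command 11 is switch 1, command 18 is switch 8.
        return "toggle VS Unisystem dip switch %d" % (cmd - 10)
    if 42 <= cmd <= 50:
        return "select save state slot %d" % (cmd - 42)
    return "unknown command %d" % cmd
```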
+
+ Special message communication occurs if the "Packet counter" is
+ 0xFFFFFFFF(only with TCP):
+
+ uint32 Length of text data, minus the null character(the null
+ character is sent, though).
+ uint8[variable] Text data. Convert all characters <32 to space, and
+ then display the text message(it's one line) as is.
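Decoding the special text message might look like this sketch. It assumes a little-endian length field (the byte order is not given in this document) and checks that the uncounted null terminator is present; parse_text_message is a made-up name.

```python
import struct


def parse_text_message(payload):
    """Decode the text that follows a packet counter of 0xFFFFFFFF."""
    (length,) = struct.unpack_from("<I", payload, 0)
    raw = payload[4:4 + length]  # the trailing null byte is not counted
    if len(raw) != length or payload[4 + length:5 + length] != b"\0":
        raise ValueError("truncated or unterminated text message")
    # Replace control characters (values below 32) with spaces before display.
    return "".join(" " if b < 32 else chr(b) for b in raw)
```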
+
+Structure of client->server communication:
+
+ uint8 Controller data(for this client).
+
+ Over tcp channel, a text message can be sent. It is one line,
+ null terminated(remember the data and parse it and display it and
+ distribute it to the clients once the null byte is received).
+ Maximum size of message(including the null byte) should be 256 bytes.
diff --git a/documentation/snes9x-lua.html b/documentation/snes9x-lua.html
new file mode 100644
index 00000000..13ab2bb5
--- /dev/null
+++ b/documentation/snes9x-lua.html
@@ -0,0 +1,417 @@
+<html><head>
+<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><title>Snes9x Lua Library</title></head><body>
+This is the API from DeHackEd's version of Lua in ZSnes. At the time of writing, FCEU's Lua was based on this API.<br /><br />
+
+<h1>Basics</h1>
+Your code will be run alongside the emulator's main loop. Your code should
+probably look roughly like this:
+<hr>
+<pre>-- initialization goes here
+while condition do
+ -- Code executed once per frame
+
+ snes9x.frameadvance()
+
+
+end
+
+-- Cleanup goes here
+</pre><hr>
+
+When Lua execution starts, the emulator will be automatically unpaused if it
+is currently paused. If so, it will automatically be paused when the script
+exits, voluntarily or otherwise. This allows you to have a script execute
+some work on your behalf and then when it exits the emulator will be paused,
+ready for the player to continue use.
+
+<br>
+<h1>Base library</h1>
+Handy little things that are not put into a class. Mostly binary operations right now.
+<hr>
+<a name="AND">
+</a><pre><a name="AND">int AND(int arg1, int arg2, ..., int argn)</a></pre>
+<a name="AND">Since Lua lacks binary operators and since binary will come up with memory manipulation,
+I offer this function. Output is the binary AND of all its parameters together.
+Minimum 1 argument, all integers.
+</a><p>
+<a name="AND">At a binary level, the AND of two binary bits is 1 if both inputs are 1, and the output
+is 0 in any other case. Commonly used to test if a bit is set by ANDing with a number
+with only the desired position set to 1.
+</a></p><hr>
+<a name="OR">
+</a><pre><a name="OR">int OR(int arg1, int arg2, ..., int argn)</a></pre>
+<a name="OR">The OR of two bits is 1 if either of the inputs is 1, and 0 if both inputs are 0.
+Typically used to force a single bit to 1, regardless of its current state.
+</a><hr>
+<a name="XOR">
+</a><pre><a name="XOR">int XOR(int arg1, int arg2, ..., int argn)</a></pre>
+<a name="XOR">XOR flips bits. An even number of 1s yields a zero and an odd number of 1s yields a 1.
+Commonly used to toggle a bit by XORing.
+</a><hr>
+<a name="BIT">
+</a><pre><a name="BIT">int BIT(int which)</a></pre>
+<a name="BIT">Returns a number with only the given bit set. <tt>which</tt> is in the range from 0 to 15 since the
+SNES is a 16 bit system. <tt>BIT(15) == 32768</tt>
+</a><p>
+<a name="BIT">... Actually this system will accept a range of 0 to 30, but none of the memory access functions will
+accept it, so you're on your own for those. 31 is not allowed for now due to signedness risking wreaking havoc.
+</a></p><h1><a name="BIT">snes9x</a></h1>
+
+<a name="BIT">Basic master emulator control.
+</a><hr>
+<a name="snes9x.speedmode">
+</a><pre><a name="snes9x.speedmode">snes9x.speedmode(string mode)</a></pre>
+<a name="snes9x.speedmode">Selects the speed mode snes9x should run at while Lua is in control of frame advance. It must be set to one of the following:
+</a><ul>
+<li><a name="snes9x.speedmode"><b><tt>normal</tt></b> sets to normal operation. The game runs at its normal speed. Speed controls (e.g. 50%) apply.</a></li>
+<li><a name="snes9x.speedmode"><b><tt>nothrottle</tt></b> makes snes9x run at maximum CPU speed while still showing each frame on screen.
+</a></li><li><a name="snes9x.speedmode"><b><tt>turbo</tt></b> drops some frames. It looks like high speed fast-forwarding.</a></li>
+<li><a name="snes9x.speedmode"><b><tt>maximum</tt></b> disables screen rendering</a></li>
+</ul>
+<a name="snes9x.speedmode">In modes other than normal, pause will have no effect.
+
+</a><hr>
+<a name="snes9x.frameadvance">
+</a><pre><a name="snes9x.frameadvance">snes9x.frameadvance()</a></pre>
+<a name="snes9x.frameadvance">Snes9x executes one frame. This function pauses until the execution finishes. General system slowdown
+when running at normal speed (ie. sleeping for 1/60 seconds) also occurs here when not in high speed mode.
+</a><p>
+<a name="snes9x.frameadvance">Warning: Due to the way the code is written, the times this function may be called are restricted. Notably,
+it must not be called within a coroutine or under a [x]pcall(). You can use coroutines for
+your own purposes, but they must not call this function themselves. Furthermore, this function cannot be called from any
+"registered" callback function. An error will occur if you do.
+</a></p><p>
+</p><hr>
+<a name="snes9x.message">
+</a><pre><a name="snes9x.message">snes9x.message(string msg)</a></pre>
+<a name="snes9x.message">Displays the indicated string on the user's screen. snes9x.speedmode("normal") is probably the only way this is of any use,
+lest the message not be displayed at all
+</a><hr>
+<a name="snes9x.pause">
+</a><pre><a name="snes9x.pause">snes9x.pause()</a></pre>
+<a name="snes9x.pause"><font color="#800000">v0.05+ only</font><br>
+Pauses the emulator. This function blocks until the user unpauses.
+</a><p>
+<a name="snes9x.pause">This function is allowed to be called from outside a frame boundary (ie. when it is not allowed to call
+snes9x.frameadvance). In this case, the function does not wait for the pause because you can't pause
+midway through a frame. Your code will continue to execute and the emulator will be paused at the end
+of the current frame. If you are at a frame boundary, this function acts a lot like snes9x.frameadvance()
+plus the whole pause thing.
+</a></p><p>
+<a name="snes9x.pause">It might be smart to reset the speed mode to "normal" if it is not already so.
+</a></p><hr>
+<a name="snes9x.wait">
+</a><pre><a name="snes9x.wait">snes9x.wait()</a></pre>
+<a name="snes9x.wait"><font color="#800000">v0.06+ only</font><br>
+Skips emulation of the next frame. If your script needs to wait for something to happen before proceeding (eg.
+input from another application) then you should call this. Otherwise the GUI might jam up and your
+application will not appear to be responding and need termination. It is expected that this function
+will pause the script for 1/60 of a second without actually running the emulator itself, though it tends to be
+OS-dependent right now.
+</a><p>
+<a name="snes9x.wait">If you're not sufficiently confused yet, think of this as pausing for one frame.
+</a></p><p>
+<a name="snes9x.wait">If you need to do a large amount of calculations -- so much that you risk setting off the rampant script
+warning, just call this function every once in a while.
+</a></p><p>
+<a name="snes9x.wait">Might want to avoid using this if you don't need to. If the emulator is running at normal speed, paused
+and the user presses frame-advance, they might be confused when nothing happens.
+</a></p><p>
+</p><h1><a name="snes9x.wait">memory</a></h1>
+
+<a name="snes9x.wait">Memory access and manipulation.
+
+</a><pre><a name="snes9x.wait">int memory.readbyte(int address)
+int memory.readword(int address)</a></pre>
+<a name="snes9x.wait">Reads a number of bits (8 or 16) and returns the memory contents. The address must be a fully qualified
+memory address. The RAM range is 0x7e0000 through 0x7fffff, but you may use any memory address, including
+the ROM data itself.
+</a><hr>
+<pre><a name="snes9x.wait">int memory.readbytesigned(int address)
+int memory.readwordsigned(int address)</a></pre><a name="snes9x.wait"><font color="#800000">v0.04+ only</font><br>
+
+Same as its counterparts, except numbers will be treated as signed. Numbers larger than 127 for bytes and
+32767 for words will be translated into the correct negative numbers. For reference, an alternate formula
+is to subtract 256 for bytes and 65536 for words from any number equal to or larger than half that number.
+For example, a byte at 250 becomes <tt>250-256 = -6</tt>.
+</a><hr>
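+<p>A minimal sketch of the alternate formula described above, for scripts that only have the unsigned
+read functions available. The helper names are hypothetical and are not part of the snes9x API.</p>

```lua
-- Hypothetical helpers; these names are not part of the snes9x API.
local function to_signed_byte(value)
  -- Unsigned 128..255 maps onto -128..-1.
  if value >= 128 then
    return value - 256
  end
  return value
end

local function to_signed_word(value)
  -- Unsigned 32768..65535 maps onto -32768..-1.
  if value >= 32768 then
    return value - 65536
  end
  return value
end

print(to_signed_byte(250))    -- the example from the text: 250 - 256
print(to_signed_word(65535))
```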
+
+<pre><a name="snes9x.wait">memory.writebyte(int address, int value)
+memory.writeword(int address, int value)</a></pre>
+
+<a name="snes9x.wait">Writes a number of bits (8 or 16) to the indicated memory address. The address MUST be in the range of
+0x7e0000 through 0x7fffff.
+
+</a><hr>
+<pre><a name="snes9x.wait">memory.register(int address, function func)</a></pre>
+<a name="snes9x.wait">When the given memory address is written to (range must be 0x7e0000 to 0x7fffff), the given function will be
+called. The execution of the CPU will be paused mid-frame to call the given function.
+</a><p>
+<a name="snes9x.wait">Only one function can be registered with a memory address. 16 bit writes will only trigger the lower address
+listener. There is no distinction between 8 and 16 bit writes. <tt>func</tt> may be nil in order to
+delete a function from listening.
+</a></p><p>
+<a name="snes9x.wait">Code called may not call snes9x.frameadvance() or any savestate save/load functions, and any button
+manipulation results are undefined. Those actions are only meaningful at frame boundaries.
+
+
+</a></p><p>
+</p><h1><a name="snes9x.wait">joypad</a></h1>
+
+<a name="snes9x.wait">Access to the gamepads. Note that Lua makes some joysticks do strange things.
+Setting joypad inputs causes the user input for that frame to be ignored, but
+only for that one frame.
+</a><p>
+<a name="snes9x.wait">Joypads are numbered 1 to 5.
+</a></p><p>
+<a name="snes9x.wait">Joypad buttons are selected by use of a table with special keys. The table
+has keys start, select, up, down, left, right, A, B, X, Y, L, R. Note that the
+keys are case-sensitive. Buttons that are pressed are set to a non-nil value
+(use of the integer 1 is just a convention). Note that "false" is valid,
+but discouraged as testing for logical true will fail.
+</a></p><p>
+<a name="snes9x.wait">Currently reading input from a movie file is not possible, but
+a movie will record button presses from Lua.
+</a></p><pre><a name="snes9x.wait">table joypad.read(int which)</a></pre>
+<a name="snes9x.wait">Returns a table indicating which buttons are pressed <i>by the user</i>.
+This is probably the only way to get input to the script by the user.
+This is always user input, even if the joypads have been set by joypad.set.
+</a><hr>
+<pre><a name="snes9x.wait">joypad.set(int which, table buttons)</a></pre>
+<a name="snes9x.wait">Sets the buttons to be pressed. These choices will be made in place of
+what the user is pressing during the next frame advance; they are then
+discarded, so this must be called once every frame, even if you just want to
+keep the same buttons pressed for several frames.
+
+</a><p>
+</p><h1><a name="snes9x.wait">savestate</a></h1>
+<a name="snes9x.wait">Control over the savestate process. Savestate objects are opaque structures
+that represent non-player accessible states (except for the functions that
+return "official" savestates). Such an object is garbage collectable, in which
+case the savestate is no longer usable. Recycling of existing savestate objects
+is highly recommended for disk space concerns lest the garbage collector
+grow lazy.</a><p>
+<a name="snes9x.wait">Each object is basically a savestate file. Anonymous savestates are saved to
+your temp directory.
+
+</a></p><hr>
+<pre><a name="snes9x.wait">object savestate.create(int userslot=nil)</a></pre>
+<a name="snes9x.wait">Creates a savestate object for use. If the userslot argument
+is given, the state will be accessible via the associated
+F-key (F1 through F12 are 1 through 12). If not specified or
+nil, an anonymous savestate is created that only Lua can access.
+</a><p>
+<a name="snes9x.wait">Each call to <tt>savestate.create()</tt> (without parameters) returns
+a unique savestate. As such, if you discard the contents of a variable
+containing an important savestate, you've just shot yourself in the foot.
+</a></p><p>
+<a name="snes9x.wait">An object may be used freely once created, saved over and loaded whenever.
+</a></p><p>
+<a name="snes9x.wait">It is an error to load an anonymous (non-player accessible) state that
+has not been saved yet, since it's empty.
+</a></p><p>
+<a name="snes9x.wait">Each savestate uses about 120 KB of disk space and the random filename generator
+has its limits with regards to how many filenames it can generate. Don't go too
+overboard. If you need more than 1000 savestates, maybe you should rethink
+your technique. <small>(The actual Windows limit is about 32768; Linux is higher.)</small>
+</a></p><hr><pre><a name="snes9x.wait">savestate.save(object state)</a></pre>
+<a name="snes9x.wait">Saves the current state to the given object. Causes an error if something goes horribly
+wrong, or if called within any "registered" callback function.
+</a><hr><pre><a name="snes9x.wait">savestate.load(object state)</a></pre>
+<a name="snes9x.wait">Loads the given state. Throws an error for all the same bad things that might happen.
+
+</a><hr>
+<pre><a name="snes9x.wait">function savestate.registersave(function save)</a></pre>
+<a name="snes9x.wait"><font color="#800000">v0.06+ only</font><br>
+Registers a function to be called on a savestate event. This includes both calls to
+<tt>savestate.save()</tt> and users pressing buttons. The function will be called without
+parameters.
+</a><p>
+<a name="snes9x.wait">The function called is permitted to return any number of string and number values.
+Lua lets you do this by simply writing <tt>return 1, 2, 3, "four and five", 6.2, integerVar</tt>
+</a></p><p>
+<a name="snes9x.wait">These variables must be numeric or string. They will be saved into the savestate itself
+and returned back to the application via savestate.registerload() should the state ever be loaded
+again later.
+</a></p><p>
+<a name="snes9x.wait">Only one function can be registered. Registering a second function will cause the first function
+to be returned back by savestate.registersave() before being discarded.
+</a></p><p>
+<a name="snes9x.wait">Savestates created with this mechanism are likely to break some savestate features in other emulators.
+Don't be surprised if savestates from this version don't work on others if you enable all those
+fancy features. Compatible savestates are created if there is no registered save function, or if
+the save function returns no parameters at all.
+</a></p><hr>
+<pre><a name="snes9x.wait">function savestate.registerload(function load)</a></pre>
+<a name="snes9x.wait"><font color="#800000">v0.06+ only</font><br>
+The companion to savestate.registersave, this function registers a function to be called during
+a load. The function will be passed parameters -- exactly those returned by the function
+registered for a save. If the savestate contains no saved data for your script, the function
+will be called without parameters.
+</a><p>
+<a name="snes9x.wait">Concept code:
+<font color="#000080">
+<pre>function saveState() .... end
+function loadState(arg1, arg2, ...) ... end
+
+savestate.registersave(saveState)
+savestate.registerload(loadState)
+
+
+-- Behind the scenes
+local saved_variables
+
+
+
+-- User presses savestate
+saved_variables = { saveState() } -- All return values saved
+
+
+-- Time passes
+-- ...
+
+
+-- User presses loadstate
+loadState(unpack(saved_variables))
+
+</pre></font>
+</a></p><h2><a name="snes9x.wait">Recommendations for registered savestates</a></h2>
+<ul><li><a name="snes9x.wait">You may want to reserve the first parameter as a sort of key to make sure that the right
+script or the correct revision of a script is receiving its data.</a></li>
+<li><a name="snes9x.wait">The order of declarations in your script should be local variables, then functions (including these),
+then any registration calls, and then your main loop. Code is executed top-to-bottom, and that includes
+variables being prepared as locals, functions created, etc.</a></li>
+<li><a name="snes9x.wait">You can build a string using string.char() from bytes and disassemble one from string.byte().
+If you need to store a table's worth of data, you'll have to do this.</a></li>
+</ul>
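+<p>A minimal sketch of the last recommendation: flattening a table of byte values into one string with
+string.char() and recovering it with string.byte(). The function names are hypothetical, and every value
+is assumed to be an integer in the range 0 to 255. Whether the savestate format copes with every possible
+byte value has not been checked here.</p>

```lua
-- Hypothetical helpers for storing a table of bytes as a single string.
local function pack_bytes(values)
  local chars = {}
  for i = 1, #values do
    -- string.char raises an error for values outside 0..255.
    chars[i] = string.char(values[i])
  end
  return table.concat(chars)
end

local function unpack_bytes(packed)
  local values = {}
  for i = 1, #packed do
    values[i] = string.byte(packed, i)
  end
  return values
end

-- Round trip: the string is what a registered save function would return.
local packed = pack_bytes({ 3, 65, 200, 255 })
local restored = unpack_bytes(packed)
print(#packed, restored[1], restored[4])
```

+<p>A registered save function could then return a script key first and the packed string second, in line
+with the first recommendation.</p>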
+<p>
+</p><h1><a name="snes9x.wait">movie</a></h1>
+<a name="snes9x.wait">Access to movie information.
+<br>
+</a><hr>
+<pre><a name="snes9x.wait">int movie.framecount()</a></pre>
+<a name="snes9x.wait">Returns the current frame count, or nil if no movie is running.
+</a><hr>
+<pre><a name="snes9x.wait">string movie.mode()</a></pre>
+<a name="snes9x.wait">Returns "record", "playback", or nil, depending on the current movie.
+</a><hr>
+<pre><a name="snes9x.wait">movie.rerecordcounting(boolean counting)</a></pre>
+<a name="snes9x.wait">Select whether rerecording should be counted. If set to false, you can do
+all the brute force work you want without inflating the rerecord count.
+</a><p>
+<a name="snes9x.wait">This will automatically be set to true after a script finishes running, so
+don't worry about messing up the rerecord count for the player.
+</a></p><hr>
+<pre><a name="snes9x.wait">movie.stop()</a></pre>
+<a name="snes9x.wait">Stops movie recording/playback. I'm not sure why you'd want to do that, but you can.
+</a><hr>
+<!--
+<pre>movie.record(string filename, boolean resetFirst = true, int players = 1)</pre>
+Starts recording a movie with the indicated filename. Will throw an error if the file
+already exists
+<hr>
+<pre>movie.playback(string filename, boolean readonly = true)</pre>
+Starts playback of the given movie file. It is an error if the file does not already exist, is invalid, etc.
+Specify readonly=true if savestates should be used for seeking during playback. If readonly=false,
+loading a savestate will switch into recording mode.
+<p>
+The selection of readonly mode might affect the ability to open a file. readonly=true would successfully
+play a file on a read-only medium (such as a CD) but readonly=false would fail.
+-->
+<h1><a name="snes9x.wait">gui</a></h1>
+<a name="snes9x.wait"><font color="#800000">0.03+ only</font><br>
+The ability to draw on the surface of the screen is a long sought feature. The surface is 256x239 pixels
+(256x224 most of the time though) with (0,0) being in the top-left corner of the screen.
+</a><p>
+<a name="snes9x.wait">The SNES uses a 16 bit colour system. Red and blue both use 5 bits (0 through 31) while green uses
+6 bits (0 through 63), in place of the usual 0 to 255 range. If you want to construct your own exact colours,
+multiply your red value by 2048, your green value by 32 and leave your blue value untouched. Add these all
+together to get a valid colour. Bright red would be <tt>31*2048 = 63488</tt>, for example.
+</a></p><p>
+<a name="snes9x.wait">Some strings are accepted. HTML style encoding such as "#00ff00" for green is accepted. Some simple strings such
+as "red", "green", "blue", "white" and "black" are also accepted.
+</a></p><p>
+<a name="snes9x.wait">The transparent colour is 1 (a VERY dark blue, which is probably not worth using in place of black) or the string
+"clear". Remove drawn elements using this colour.
+</a></p><p>
+<a name="snes9x.wait">Output is delayed by a frame. The graphics are drawn on a separate buffer and then overlaid on the image
+during the next refresh, which means allowing for a frame to execute. Also, the buffer is cleared after drawing,
+so if you want to keep something on screen, you must keep drawing it on each frame.
+</a></p><p>
+<a name="snes9x.wait">It is an error to draw outside the drawing area. gdoverlay is the only exception to this rule - images will
+be clipped to the visible area.
+</a></p><hr>
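+<p>A minimal sketch of the 16 bit colour arithmetic described above (red times 2048, green times 32, blue
+as is). The function names are hypothetical and not part of the gui library; the range checks simply
+restate the limits given in the text.</p>

```lua
-- Hypothetical helpers for the colour format described in the text:
-- 5 bits of red, 6 bits of green, 5 bits of blue.
local function pack_colour(red, green, blue)
  assert(red >= 0 and red <= 31, "red must be 0..31")
  assert(green >= 0 and green <= 63, "green must be 0..63")
  assert(blue >= 0 and blue <= 31, "blue must be 0..31")
  return red * 2048 + green * 32 + blue
end

local function unpack_colour(colour)
  local red = math.floor(colour / 2048)
  local green = math.floor((colour % 2048) / 32)
  local blue = colour % 32
  return red, green, blue
end

print(pack_colour(31, 0, 0))   -- bright red, the example from the text
print(unpack_colour(63488))
```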
+<pre><a name="snes9x.wait">r,g,b = gui.getpixel(int x, int y)</a></pre>
+<a name="snes9x.wait">Returns the pixel on the indicated coordinate. (0,0) is the top-left corner and (255, 223) is the typical bottom-right corner,
+though (255,238) is allowed. The return value range is (0,0,0) to (31,63,31). You get the actual screen surface before
+any damage is done by your drawing. Well, unless you call snes9x.wait() in which case your damage is applied and the SNES
+hardware doesn't get a chance to draw a new frame. :)
+</a><hr>
+<pre><a name="snes9x.wait">gui.drawpixel(int x, int y, type colour)</a></pre>
+<a name="snes9x.wait">Draw a single pixel on the screen.
+</a><hr>
+<pre><a name="snes9x.wait">gui.drawline(int x1, int y1, int x2, int y2, type colour)</a></pre>
+<a name="snes9x.wait">Draw a line between the two indicated positions.
+</a><hr>
+<pre><a name="snes9x.wait">gui.drawbox(int x1, int y1, int x2, int y2, type colour)</a></pre>
+<a name="snes9x.wait">Draw a box going through the indicated opposite corners.
+</a><hr>
+<pre><a name="snes9x.wait">gui.text(int x, int y, string message)</a></pre>
+<a name="snes9x.wait">Write text on the screen at the indicated position.
+</a><p>
+<a name="snes9x.wait">The coordinates determine the top-left corner of the box that the text fits in.
+The font is the same one as the snes9x messages, and you can't control colours or anything. <tt>:(</tt>
+</a></p><p>
+<a name="snes9x.wait">The minimum y value is 9 for the font's height and each letter will take around 8 pixels of width.
+Text that exceeds the viewing area will be cut short, so ensuring your text will fit would be wise.
+
+</a></p><hr>
+<pre><a name="snes9x.wait">string gui.gdscreenshot()</a></pre>
+<a name="snes9x.wait"><font color="#800000">0.04+ only</font><br>
+Takes a screen shot of the image and returns it in the form of a string which can be imported by
+the </a><a href="http://lua-gd.luaforge.net/">gd library</a> using the gd.createFromGdStr() function.
+<p>
+This function is provided so as to allow snes9x to not carry a copy of the gd library itself. If you
+want raw RGB32 access, skip the first 11 bytes (header) and then read pixels as Alpha (always 0), Red,
+Green, Blue, left to right then top to bottom, range is 0-255 for all colours.
+</p><p>
+Warning: Storing screen shots in memory is not recommended. Memory usage will blow up pretty quick.
+One screen shot string eats around 230 KB of RAM.
+
+</p><hr>
+<pre>gui.gdoverlay(int x=0, int y=0, string gdimage)</pre>
+<font color="#800000">0.04+ only</font><br>
+Overlays the given image on top of the screen with the top-left corner in the given screen location.
+Transparency is not fully supported -- a pixel must be 100% transparent to actually leave
+a hole in the overlaid image or else it will be treated as opaque.<p>
+Naturally, the format for input is the gd file format, version 1. The image MUST be truecolour.
+
+</p><p>
+The image will be clipped to fit into the screen area.
+</p><hr>
+<pre>gui.transparency(int strength)</pre>
+<font color="#800000">0.04+ only</font><br>
+Transparency mode. A value of 0 means opaque; a value of 4 means invisible (useful option that one).
+As for 1 through 3, I'll let you figure those out.<p>
+All image drawing (including gui.gdoverlay) will have the given transparency level applied from that point
+on. Note that drawing on the same point over and over will NOT build up to a higher opacity level.
+</p><hr>
+<pre>function gui.register(function func)</pre>
+<font color="#800000">0.04+ only</font><br>
+Register a function to be called between a frame being prepared for displaying on your screen and
+it actually happening. Used when that 1 frame delay for rendering is a pain in the butt.
+<p>This function is not necessarily complicated to use, but it's not recommended to users
+new to the whole scripting thing.
+</p><p>
+You may pass nil as the parameter to kill off a registered function. The old function (if any) will be
+returned.
+</p><hr>
+<pre>string function gui.popup(string message, [string type = "ok"])</pre>
+<font color="#800000">v0.05+ only</font><br>
+Pops up a dialog to the user with a message, and returns after the user acknowledges the dialog.
+<tt>type</tt> may be any of "ok", "yesno", "yesnocancel". The return value will be "yes", "no" or "cancel"
+as the case may be. "ok" is a waste of effort.
+<p>
+Linux users might want to install xmessage to perform the work. Otherwise the dialog will
+appear on the shell and that's less noticeable.
+</p></body></html> \ No newline at end of file
diff --git a/documentation/tech/cpu/4017.txt b/documentation/tech/cpu/4017.txt
new file mode 100644
index 00000000..25cb8393
--- /dev/null
+++ b/documentation/tech/cpu/4017.txt
@@ -0,0 +1,97 @@
+This is an email posted to nesdev by Ki a while back. I have removed one
+line at the end regarding the B flag of the CPU (the information was
+incorrect, which Ki noted in a later email).
+
+--------------------------------------------------------------------------------
+
+ By reading Brad's NESSOUND document, we know that there is a
+"frame counter" in the NES/FC APU. I would like to post
+some more on this.
+
+ The frame counter is reset upon any write to $4017. It is
+reset at system power-on as well, but is NOT reset upon
+system reset.
+
+ Thanks to Samus Aran, we now know the exact period of the
+PPU's single frame. In other words, we are now sure that
+the NMI occurs every 29780 2/3 CPU cycles.
+
+ However, the APU's single frame is NOT 29780 2/3 CPU cycles.
+What I mean by "APU's single frame" here is that it is the
+number of CPU cycles taken between the frame IRQs.
+
+ The APU's single frame seems to be:
+
+ 1789772.727... / 60 = 29829 6/11 [CPU CYCLE]
+
+ Below is a simple diagram which shows the difference
+in periods of the PPU's single frame and the APU's.
+
+
+ RESET 29780 2/3 CPU CYCLES NMI
+PPU |------------------------------------------|
+ | 29829 6/11 CPU CYCLES IRQ
+APU |----------|----------|----------|----------|
+
+
+ Note that if you write $00 to $4017 on every NMI, the frame
+IRQ would NEVER go off even if it is enabled. This is because
+the period of NMI is slightly shorter than the period of
+the frame IRQ. This causes the frame counter to be reset
+before the frame IRQ goes off.
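+
+The comparison above can be checked with a few lines of arithmetic. This is
+only a sketch: the variable names are made up for illustration, and the clock
+value is the approximate 1789772.727 figure quoted in the text.

```lua
-- Figures taken from the text; names are illustrative only.
local CPU_HZ = 1789772.727          -- approximate NTSC CPU clock
local apu_frame = CPU_HZ / 60       -- cycles between frame IRQs
local ppu_frame = 29780 + 2 / 3     -- cycles between NMIs

-- A write to $4017 on every NMI resets the frame counter this many
-- cycles before the frame IRQ would have fired.
local margin = apu_frame - ppu_frame

print(apu_frame, ppu_frame, margin)
```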
+
+When you write zero to bit 7 of $4017, the frame counter will
+be reset, and the first sound update will be done after 7457 CPU
+cycles (i.e. 29829/4). The 2nd update will be done 7457 cycles after that;
+the same goes for the 3rd and 4th updates, but the frame IRQ occurs
+on the 4th update, resetting the frame counter as well.
+
+When you write 1 to bit 7 of $4017, the frame counter will be
+reset, but the first sound update will occur at the same time.
+The 2nd, 3rd, and 4th updates will be done 7457, 14914, and 22371
+CPU cycles after the first update respectively, but the 5th
+update will be 14914 cycles after the 4th update. This causes
+sound output to last 1.25 times longer than that of bit 7 = 0.
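+
+As a rough sketch of the two sequences just described, the update times
+(in CPU cycles after the write to $4017) can be tabulated as follows. The
+function name is hypothetical, and the 7457 cycle step is the figure given
+above, not an independently measured value.

```lua
local STEP = 7457  -- CPU cycles between sound updates, as given in the text

-- Returns the times of one full sequence of sound updates, measured in
-- CPU cycles from the write to $4017. bit7 is the value written to bit 7.
local function update_times(bit7)
  if bit7 == 0 then
    -- Four evenly spaced updates; the frame IRQ occurs on the 4th.
    return { STEP, 2 * STEP, 3 * STEP, 4 * STEP }
  end
  -- First update happens immediately; the 5th comes two steps after the 4th.
  return { 0, STEP, 2 * STEP, 3 * STEP, 5 * STEP }
end

local short = update_times(0)
local long = update_times(1)
-- Ratio of the two sequence lengths (the "1.25 times longer" in the text).
print(long[#long] / short[#short])
```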
+
+
+$4017W:
+
+o when the MSB of $4017 is 0:
+
+bit7=0
+ |---------|---------|---------|---------|---------|---------|----
+ 1st 2nd 3rd 4th 5th(1st) 6th(2nd)
+
+
+o when the MSB of $4017 is 1:
+
+bit7=1
+ |---------|---------|---------|-------------------|---------|----
+ 1st 2nd 3rd 4th 5th(1st) 6th(2nd)
+
+
+On 1st, 3rd, 5th, ... updates, the envelope decay and the
+linear counter are updated.
+
+On 2nd, 4th, 6th, ... updates, the envelope decay, the
+linear counter, the length counter, and the frequency sweep
+are updated.
+----
+
+ The original info was provided by goroh, and verified by me.
+However, it could still be wrong. Please tell me if you
+find anything wrong.
+----
+
+(Correction from my last posting)
+
+ I have checked once again and it turned out that the frame IRQ
+was NOT disabled upon system reset. What actually prevented the
+frame IRQ from occurring after system reset was, in fact, the I flag.
+I checked this flag shortly after system reset (right after stack
+pointer was initialized), and the flag was 1, although I never
+executed "sei" after reset. Therefore the I flag of the PR2A03G
+is 1 on system reset.
+
+ Thanks Matthew Conte and Samus Aran for pointing out the
+inaccuracy.
diff --git a/documentation/tech/cpu/dmc.txt b/documentation/tech/cpu/dmc.txt
new file mode 100644
index 00000000..c33f4de8
--- /dev/null
+++ b/documentation/tech/cpu/dmc.txt
@@ -0,0 +1,235 @@
+Delta modulation channel tutorial 1.0
+Written by Brad Taylor
+
+Last updated: August 20th, 2000.
+
+All results were obtained by studying prior information available (from
+nestech 1.00, and postings on NESDev from miscellaneous people), and through
+a series of experiments conducted by me. Results acquired by individuals
+prior to my reverse-engineering have been double checked, and final results
+have been confirmed. Credit is due to those individual(s) who contributed
+any information in regards to the DMC.
+
+Description
+-----------
+
+The delta modulation channel (DMC) is a complex digital network of counters
+and registers used to produce analog sound. Its primary function is to play
+"samples" from memory, and have an internal counter connected to a digital
+to analog converter (DAC) updated accordingly. The channel is able to be
+assigned a pointer to a chunk of memory to be played. At timed intervals,
+the DMC will halt the 2A03 (NES's CPU) for 1 clock cycle to retrieve the
+sample to be played. This method of playback will be referred to from here on as
+direct memory access (DMA). Another method of playback known as pulse code
+modulation (PCM) is available by the channel, which requires the constant
+updating of one of the DMC's memory-mapped registers.
+
+Registers
+---------
+
+The DMC has 5 registers assigned to it. They are as follows:
+
+$4010: play mode and DMA frequency
+$4011: delta counter
+$4012: play code's starting address
+$4013: length of play code
+$4015: DMC/IRQ status
+
+Note that $4015 is the only R/W register. All others are write only (attempts
+to read them will most likely result in a returned 040H, due to heavy
+capacitance on the NES's data bus).
+
+$4010 - Play mode and DMA frequency
+-----------------------------------
+This register is used to control the frequency of the DMA fetches, and to
+control the playback mode.
+
+Bits
+----
+6-7 this is the playback mode.
+
+ 00 - play DMC sample until length counter reaches 0 (see $4013)
+ x1 - loop the DMC sample (x = immaterial)
+ 10 - play DMC sample until length counter reaches 0, then generate a CPU
+IRQ
+
+Looping (playback mode "x1") will have the chunk of memory played over and
+over, until the channel is disabled (via $4015). In this case, after the
+length counter reaches 0, it will be reloaded with the calculated length
+value of $4013.
+
+If playback mode "10" is chosen, an interrupt will be dispatched when the
+length counter reaches 0 (after the sample is done playing). There are 2
+ways to acknowledge the DMC's interrupt request upon receiving it. The first
+is a write to this register ($4010), with the MSB (bit 7) cleared (0). The
+second is any write to $4015 (see the $4015 register description for more
+details).
+
+If playback mode "00" is chosen, the sample plays until the length counter
+reaches 0. No interrupt is generated.
+
+5-4 appear to be unused
+
+3-0 this is the DMC frequency control. Valid values are from 0 - F. The
+value of this register determines how many CPU clocks to wait before the DMA
+will fetch another byte from memory. The # of clocks to wait -1 is initially
+loaded into an internal 12-bit down counter. The down counter is then
+decremented at the frequency of the CPU (1.79MHz). The channel fetches the
+next DMC sample byte when the count reaches 0, and then reloads the count.
+This process repeats until the channel is disabled by $4015, or when the
+length counter has reached 0 (if not in the looping playback mode). The
+exact number of CPU clock cycles is as follows:
+
+value    CPU clocks
+written  (hex)       octave  scale
+-------  ----------  ------  -----
+F        1B0         8       C
+E        240         7       G
+D        2A0         7       E
+C        350         7       C
+B        400         6       A
+A        470         6       G
+9        500         6       F
+8        5F0         6       D
+7        6B0         6       C
+6        710         5       B
+5        7F0         5       A
+4        8F0         5       G
+3        A00         5       F
+2        AA0         5       E
+1        BE0         5       D
+0        D60         5       C
+
+The octave and scale values shown represent the DMC DMA clock cycle rate
+equivalent. These values are merely shown for the music enthusiast
+programmer, who is more familiar with notes than clock cycles.
+
+Every fetched byte is loaded into an internal 8-bit shift register. The shift
+register is then clocked at 8x the DMA frequency (which means that the CPU
+clock count would be 1/8th that of the DMA clock count), or shifted at +3
+the octave of the DMA (same scale). The data shifted out of the register is
+in serial form, and the least significant bit (LSB, or bit 0) of the fetched
+byte is the first one to be shifted out (then bit 1, bit 2, etc.).
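+
+To make the relationship between the table and the shift register concrete,
+here is a small sketch that derives the per-bit clock count and the resulting
+bit rate from the table values. The table contents come from the text; the
+names and the approximate CPU clock constant are assumptions for illustration.

```lua
local CPU_HZ = 1789772.727  -- approximate NTSC CPU clock (1.79MHz in the text)

-- CPU clocks between DMA byte fetches, indexed by the value written to
-- bits 3-0 of $4010 (index 0 is given explicitly, 1..15 follow in order).
local DMA_CLOCKS = {
  [0] = 0xD60,
  0xBE0, 0xAA0, 0xA00, 0x8F0, 0x7F0, 0x710, 0x6B0, 0x5F0,
  0x500, 0x470, 0x400, 0x350, 0x2A0, 0x240, 0x1B0,
}

-- The shift register runs at 8x the DMA frequency, so one bit takes
-- 1/8th of the clocks that one byte takes.
local function clocks_per_bit(value)
  return DMA_CLOCKS[value] / 8
end

local function bit_rate_hz(value)
  return CPU_HZ / clocks_per_bit(value)
end

for value = 0, 15 do
  print(value, DMA_CLOCKS[value], clocks_per_bit(value), bit_rate_hz(value))
end
```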
+
+The bits shifted out are then fed to the UP/DOWN control pin of the internal
+delta counter, which will effectively have the counter increment its
+retained value by one on "1" bit samples, and decrement its value by one on
+"0" bit samples. This counter is clocked at the same frequency of the shift
+register's.
+
+The counter is only 6 bits in size, and has its 6 outputs tied to the 6 MSB
+inputs of a 7 bit DAC. The analog output of the DAC is then what you hear
+being played by the DMC.
+
+Wrap around counting is not allowed on this counter. Instead, a "clipping"
+behaviour is exhibited. If the internal value of the counter has reached 0,
+and the next bit sample is a 0 (instructing a decrement), the counter will
+take no action. Likewise, if the counter's value is currently at -1
+(111111B, or 03FH), and the bit sample to be played is a 1, the counter will
+not increment.
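+
+The shifting and clipping behaviour described above can be sketched in a few
+lines. This is an illustration only, not emulator code: the function name is
+hypothetical, and it models just the 6 bit counter, not the DAC or timing.

```lua
-- Applies one fetched sample byte to the 6-bit delta counter.
-- Bits are consumed LSB first; the counter clips at 0 and 63 (03FH)
-- instead of wrapping around.
local function apply_sample_byte(counter, byte)
  for _ = 1, 8 do
    local bit = byte % 2
    byte = math.floor(byte / 2)
    if bit == 1 then
      if counter < 63 then
        counter = counter + 1
      end
    else
      if counter > 0 then
        counter = counter - 1
      end
    end
  end
  return counter
end

print(apply_sample_byte(0, 0xFF))   -- eight increments from 0
print(apply_sample_byte(62, 0xFF))  -- clips at the top of the range
print(apply_sample_byte(1, 0xF0))   -- clips at 0, then climbs
```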
+
+
+$4011 - Delta counter load register
+-----------------------------------
+
+bits
+----
+7 appears to be unused
+1-6 the load inputs of the internal delta counter
+0 LSB of the DAC
+
+A write to this register effectively loads the internal delta counter with a
+6 bit value, but can be used for 7 bit PCM playback. Bit 0 is connected
+directly to the LSB (bit 0) of the DAC, and has no effect on the internal
+delta counter. Bit 7 appears to be unused.
+
+This register can be used to output direct 7-bit digital PCM data to the
+DMC's audio output. To use this register for PCM playback, the programmer
+would be responsible for making sure that this register is updated at a
+constant rate. The rate is completely user-definable. For the regular CD
+quality 44100 Hz playback sample rate, this register would have to be
+written to approximately every 40 CPU cycles (assuming the 2A03 is running @
+1.79 MHz).
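+
+The "approximately every 40 CPU cycles" figure follows from dividing the CPU
+clock by the sample rate. A minimal sketch, with a hypothetical function name
+and the approximate clock value used elsewhere in these notes:

```lua
local CPU_HZ = 1789772.727  -- approximate NTSC CPU clock

-- CPU cycles available between successive writes to $4011
-- for a given PCM sample rate in Hz.
local function cycles_per_sample(sample_rate_hz)
  return CPU_HZ / sample_rate_hz
end

print(cycles_per_sample(44100))
print(cycles_per_sample(8000))
```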
+
+
+$4012 - DMA address load register
+----------------------------
+
+This register contains the initial address where the DMC is to fetch samples
+from memory for playback. The effective address value is $4012 shl 6 or
+0C000H. This register is connected to the load pins of the internal DMA
+address pointer register (counter). The counter is incremented after every
+DMA byte fetch. The counter is 15 bits in size, and has addresses wrap
+around from $FFFF to $8000 (not $C000, as you might have guessed). The DMA
+address pointer register is reloaded with the initial calculated address,
+when the DMC is activated from an inactive state, or when the length counter
+has arrived at terminal count (count=0), if in the looping playback mode.
+
+
+$4013 - DMA length register
+---------------------------
+
+This register contains the length of the chunk of memory to be played by the
+DMC, and its size is measured in bytes. The value of $4013 shl 4 is loaded
+into a 12 bit internal down counter, dubbed the length counter. The length
+counter is decremented after every DMA fetch, and when it arrives at 0, the
+DMC will take action(s) based on the 2 MSB of $4010. This counter will be
+loaded with the current calculated length value of $4013 when the DMC is
+activated from an inactive state. Because the value that is loaded by the
+length counter is $4013 shl 4, this effectively produces a calculated byte
+sample length of $4013 shl 4 + 1 (i.e. if $4013=0, sample length is 1 byte
+long; if $4013=FF, sample length is $FF1 bytes long).
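+
+The address and length calculations for $4012 and $4013 described above can
+be written out as follows. This is a sketch with hypothetical function names;
+plain arithmetic is used in place of shifts and ORs, which gives the same
+result because ($4012 shl 6) never overlaps the 0C000H bits.

```lua
-- Initial DMA address: ($4012 shl 6) or 0C000H.
local function dmc_start_address(reg4012)
  return 0xC000 + reg4012 * 64
end

-- Effective sample length in bytes: ($4013 shl 4) + 1.
local function dmc_sample_length(reg4013)
  return reg4013 * 16 + 1
end

-- The address counter wraps from $FFFF to $8000, not to $C000.
local function dmc_next_address(address)
  if address == 0xFFFF then
    return 0x8000
  end
  return address + 1
end

print(string.format("%04X", dmc_start_address(0xFF)))
print(string.format("%X", dmc_sample_length(0xFF)))
```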
+
+
+$4015 - DMC status
+------------------
+
+This contains the current status of the DMC channel. There are 2 read bits,
+and 1 write bit.
+
+bits
+----
+7(R) DMC's IRQ status (1=CPU IRQ being caused by DMC)
+4(R) DMC is currently enabled (playing a stream of samples)
+4(W) enable/disable DMC (1=start/continue playing a sample;0=stop playing)
+
+When an IRQ goes off inside the 2A03, Bit 7 of $4015 can tell the interrupt
+handler if it was caused by the DMC hardware or not. This bit will be set
+(1) if the DMC is responsible for the IRQ. Of course, if your program has no
+other IRQ-generating hardware going while it's using the DMC, then reading
+this register is not necessary upon IRQ generation. Note that reading this
+register will NOT clear bit 7 (meaning that the DMC's IRQ will still NOT be
+acknowledged). Also note that if the 2 MSB of $4010 were not set to 10, no IRQ
+will be generated, and bit 7 will always be 0.
+
+Upon generation of an IRQ, to let the DMC know that the software has
+acknowledged the /IRQ (and to reset the DMC's internal IRQ flag), any write
+out to $4015 will reset the flag, or a write out to $4010 with the MSB set
+to 0 will do. These practices should be performed inside the IRQ handler
+routine. To replay the same sample that just finished, all you need to do is
+just write a 1 out to bit 4 of $4015.
+
+Bit 4 of $4015 reports the real-time status of the DMC. A returned value of
+1 denotes that the channel is currently playing a stream of samples. A
+returned value of 0 indicates that the channel is inactive. If the
+programmer needed to know when a stream of samples was finished playing, but
+didn't want to use the IRQ generation feature of the DMC, then polling this
+bit would be a valid option.
+
+Writing a value to $4015's 4th bit has the effect of enabling the channel
+(start, or continue playing a stream of samples), or disabling the channel
+(stop all DMC activity). Note that writing a 1 to this bit while the channel
+is currently enabled, will have no effect on counters or registers internal
+to the DMC.
+
+The conditions that control the time the DMC will stay enabled are
+determined by the 2 MSB of $4010, and register $4013 (if applicable).
+
+
+System Reset
+------------
+
+On system reset, all 7 used bits of $4011 are reset to 0, the IRQ flag is
+cleared (disabled), and the channel is disabled. All other registers will
+remain unmodified.
+
diff --git a/documentation/tech/cpu/nessound-4th.txt b/documentation/tech/cpu/nessound-4th.txt
new file mode 100644
index 00000000..c592d2ed
--- /dev/null
+++ b/documentation/tech/cpu/nessound-4th.txt
@@ -0,0 +1,551 @@
+*******************************************
+*2A03 sound channel hardware documentation*
+*******************************************
+Brad Taylor (big_time_software@hotmail.com)
+
+ 4th release: February 19th, 2K3
+
+
+ All results were obtained by studying prior information available (from nestech 1.00, and postings on NESDev from miscellaneous people), and through a series of experiments conducted by me. Results acquired by individuals prior to my reverse-engineering have been double checked, and final results have been confirmed. Credit is due to those individual(s) who contributed miscellaneous information in regards to NES sound channel hardware. Such individuals are:
+
+ Goroh
+ Memblers
+ FluBBa
+ Izumi
+ Chibi-Tech
+ Quietust
+ SnowBro
+
+ Kentaro Ishihara (Ki) is responsible for posting (on the NESdev mailing list) differences in the 2 square wave channels, including the operation of 2A03 hardware publicly undocumented (until now) such as the frame IRQ counter, and its ties with sound hardware. Goroh had originally discovered some of this information, and Ki confirmed it.
+
+ A special thanks goes out to Matthew Conte, for his expertise on pseudo-random number generation (among other things), which allowed for the full reverse engineering of the NES's noise channel to take place. Without his help, I would still be trying to find a needle in a haystack, as far as the noise's method of pseudo-random number generation goes. Additionally, his previous findings / reverse engineering work on the NES's sound hardware really got the ball of NES sound emulation rolling. If it weren't for Matt's original work, this document wouldn't exist.
+
+
+****************
+* Introduction *
+****************
+ The 2A03 (NES's integrated CPU) has 4 internal channels to it that have the ability to generate semi-analog sound, for musical playback purposes. These channels are 2 square wave channels, one triangle wave channel, and a noise generation channel. This document will go into full detail on every aspect of the operation and timing of the mentioned sound channels.
+
+
+*******************
+* Channel details *
+*******************
+ Each channel has different characteristics to it that make up its operation.
+
+ The square channel(s) have the ability to generate a square wave frequency in the range of 54.6 Hz to 12.4 KHz. Their key features are frequency sweep abilities, and output duty cycle adjustment.
+
+ The triangle wave channel has the ability to generate an output triangle wave with a resolution of 4-bits (16 steps), in the range of 27.3 Hz to 55.9 KHz. The key features of this channel are its analog triangle wave output, and its linear counter, which can be set to automatically disable the channel's sound after a certain period of time has gone by.
+
+ The noise channel is used for producing random frequencies, which results in a "noisy" sounding output. Output frequencies can range anywhere from 29.3 Hz to 447 KHz. Its key feature is its pseudo-random number generator, which generates the random output frequencies heard from the channel.
+
+
+*****************
+* Frame counter *
+*****************
+ The 2A03 has an internal frame counter. The purpose of it is to generate the various low frequency signals (60, 120, 240 Hz, and 48, 96, 192 Hz) required to clock several of the sound hardware's counters. It also has the ability to generate IRQ's.
+
+ The smallest unit of timing the frame counter operates around is 240Hz; all other frequencies are generated by multiples of this base frequency. A clock divider of 14915 (clocked at twice the CPU speed) is used to get 240Hz (this was the actual measured ratio).
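The divider arithmetic above can be checked with a short sketch. The exact CPU clock value (1789772.727 Hz, NTSC) and the names `CPU_CLOCK_HZ`, `DIVIDER` and `frame_counter_rates` are assumptions made for illustration; the document itself only gives the divider and the resulting nominal frequencies.

```python
# Assumed NTSC CPU clock; the document only quotes "1.79 MHz".
CPU_CLOCK_HZ = 1789772.727
DIVIDER = 14915  # measured divider, clocked at twice the CPU speed


def frame_counter_rates(five_step=False, cpu_hz=CPU_CLOCK_HZ):
    """Return the average clock rates (Hz) derived from the frame counter."""
    base = 2 * cpu_hz / DIVIDER      # close to 240 Hz
    counts = 5 if five_step else 4   # $4017.7 selects the divide rate
    return {
        "envelope_linear": base * 4 / counts,  # 240 Hz or 192 Hz
        "sweep_length": base * 2 / counts,     # 120 Hz or 96 Hz
        "frame_irq": base / counts,            # 60 Hz or 48 Hz
    }
```

Calling `frame_counter_rates()` gives values within a few thousandths of 240, 120 and 60 Hz; passing `five_step=True` gives roughly 192, 96 and 48 Hz.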
+
+
++---------------+
+|$4017 operation|
++---------------+
+ Writes to register $4017 control operation of both the clock divider, and the frame counter.
+
+ - Any write to $4017 resets both the frame counter, and the clock divider. Sometimes, games will write to this register in order to synchronize the sound hardware's internal timing, to the sound routine's timing (usually tied into the NMI code). The frame IRQ period is slightly longer than the PPU's frame period, so you can see why games would desire this synchronization.
+
+ - bit 7 of $4017 controls the frame counter's divide rate. Every time the counter cycles (reaches terminal count (0)), a frame IRQ will be generated, if enabled by clearing bit 6 of $4017. $4015.6 holds the status of the frame counter IRQ; it will be set if the frame counter is responsible for the interrupt.
+
+$4017.7 divider frame IRQ freq.
+------- ------- ---------------
+0 4 60
+1 5 48
+
+ On 2A03 reset, both bits of $4017 (6 & 7) will be cleared, enabling frame IRQ's right from the start. The reason why the existence of frame IRQ's is generally unknown is that the 6502's maskable interrupt is disabled on reset, and this blocks out the frame IRQ's. Most games don't use any IRQ-generating hardware in general, therefore they don't bother enabling maskable interrupts.
+
+ Note that the IRQ line will be held down by the frame counter until it is acknowledged (by reading $4015). Before this, the 6502 will generate an IRQ *every* time interrupts are enabled (either by CLI or RTI), since the IRQ design on the 6502 is level-triggered, and not edge. If you've written a program that does not read $4015 in the IRQ handler, and you execute CLI, the processor will immediately go into an infinite IRQ call-return loop.
+
+
++-----------------------+
+|Frame counter operation|
++-----------------------+
+ Depending on the status of $4017.7, the frame counter will follow 2 different count sequences. These sequences determine when sound hardware counters will be clocked. The sequences are initialized immediately following any write to $4017.
+
+$4017.7 sequence
+------- --------
+0 4, 0,1,2,3, 0,1,2,3,..., etc.
+1 0,1,2,3,4, 0,1,2,3,4,..., etc.
+
+ During count sequences 0..3, the linear (triangle) and envelope decay (square & noise) counters receive a clock for each count. This means that both these counters are clocked once immediately after $4017.7 is written with a value of 1.
+
+ Count sequences 1 & 3 clock (update) the frequency sweep (square), and length (all channels) counters. Even though the length counter's smallest unit of time counting is a frame, it seems that it is actually being clocked twice per frame. That said, you can consider the length counters to contain an extra stage to divide this clock signal by 2.
+
+ No aforementioned sound hardware counters are clocked on count sequence #4. You should now see how this causes the 96, and 192 Hz signals to be generated when $4017.7=1.
+
+ The rest of the document will describe the operation of the sound channels using the $4017.7=0 frequencies (60, 120, and 240 Hz). For $4017.7=1 operation, replace those frequencies with 48, 96, and 192 Hz (respectively).
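As a rough illustration of the count sequences and clocking rules described in this section, the sketch below lists the counts produced after a $4017 write and the counters clocked on each count. The function names `count_sequence` and `clocked_units` are hypothetical, and the model ignores exact cycle timing.

```python
def count_sequence(mode_bit, n):
    """First n counts of the frame counter after a write to $4017.

    mode_bit is the value written to $4017.7.
    """
    out = []
    if mode_bit == 0:
        out.append(4)            # mode 0 begins on count 4 (nothing clocked)
    period = 5 if mode_bit else 4
    step = 0
    while len(out) < n:
        out.append(step)
        step = (step + 1) % period
    return out[:n]


def clocked_units(count):
    """Counters that receive a clock on the given count."""
    units = []
    if count in (0, 1, 2, 3):
        units += ["linear", "envelope"]
    if count in (1, 3):
        units += ["sweep", "length"]
    return units                 # count 4 clocks nothing
```

For example, `count_sequence(1, 6)` returns `[0, 1, 2, 3, 4, 0]`, so with $4017.7=1 the envelope and linear counters see 4 clocks for every 5 counts, which is where the 192 Hz figure comes from.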
+
+
+************************
+* Sound hardware delay *
+************************
+ After resetting the 2A03, the first time any sound channel(s) length counter contains a non-zero value (channel is enabled), there will be a 2048 CPU clock cycle delay before any of the sound hardware is clocked. After the 2K clock cycles go by, the NES sound hardware will be clocked normally. This phenomenon only occurs following a system reset, and only occurs during the first 2048 CPU clocks after the activation of any of the 4 basic sound channels.
+
+ The information in regards to this delay is only provided to keep this document accurate with all information that is currently known about the 2A03's sound hardware. I haven't run many tests on the behaviour of this delay (mainly because I don't care, as I view it as an inconvenience anyway), so this information should be taken with a grain of salt.
+
+
+************************
+* Register Assignments *
+************************
+ The sound hardware internal to the 2A03 has been designated these special memory addresses in the CPU's memory map.
+
+$4000-$4003 Square wave 1
+$4004-$4007 Square wave 2 (identical to the first, except for upward frequency sweeps (see "sweep unit" section))
+$4008-$400B Triangle
+$400C-$400F Noise
+$4015 Channel enable / length/frame counter status
+$4017 frame counter control
+
+ Note that $4015 (and $4017, though it is unrelated to sound hardware) are the only R/W registers. All others are write only (attempts to read them will most likely return the last byte on the bus (usually 040H), due to heavy capacitance on the NES's data bus). Reading a "write only" register will have no effect on the specific register, or channel.
+
+ Every sound channel has 4 registers affiliated with it. Descriptions of the register sets are as follows:
+
++----------------+
+| Register set 1 |
++----------------+
+
+$4000(sq1)/$4004(sq2)/$400C(noise) bits
+---------------------------------------
+0-3 volume / envelope decay rate
+4 envelope decay disable
+5 length counter clock disable / envelope decay looping enable
+6-7 duty cycle type (unused on noise channel)
+
+$4008(tri) bits
+---------------
+0-6 linear counter load register
+7 length counter clock disable / linear counter start
+
+
++----------------+
+| Register set 2 |
++----------------+
+
+$4001(sq1)/$4005(sq2) bits
+--------------------------
+0-2 right shift amount
+3 decrease / increase (1/0) wavelength
+4-6 sweep update rate
+7 sweep enable
+
+$4009(tri)/$400D(noise) bits
+----------------------------
+0-7 unused
+
+
++----------------+
+| Register set 3 |
++----------------+
+
+$4002(sq1)/$4006(sq2)/$400A(Tri) bits
+-------------------------------------
+0-7 8 LSB of wavelength
+
+$400E(noise) bits
+-----------------
+0-3 playback sample rate
+4-6 unused
+7 random number type generation
+
+
++----------------+
+| Register set 4 |
++----------------+
+
+$4003(sq1)/$4007(sq2)/$400B(tri)/$400F(noise) bits
+--------------------------------------------------
+0-2 3 MS bits of wavelength (unused on noise channel)
+3-7 length counter load register
+
+
++--------------------------------+
+| length counter status register |
++--------------------------------+
+
+$4015(read)
+-----------
+0 square wave channel 1
+1 square wave channel 2
+2 triangle wave channel
+3 noise channel
+4 DMC (see "DMC.TXT" for details)
+5 unused
+6 frame IRQ status
+7 IRQ status of DMC (see "DMC.TXT" for details)
+
+
++-------------------------+
+| channel enable register |
++-------------------------+
+
+$4015(write)
+------------
+0 square wave channel 1
+1 square wave channel 2
+2 triangle wave channel
+3 noise channel
+4 DMC channel (see "DMC.TXT" for details)
+5-7 unused
+
+
+************************
+* Channel architecture *
+************************
+ This section will describe the internal components making up each individual channel. Each component will then be described in full detail.
+
+Device                      Triangle  Noise   Square
+------                      --------  ------  ------
+triangle step generator     X
+linear counter              X
+programmable timer          X         X       X
+length counter              X         X       X
+4-bit DAC                   X         X       X
+volume/envelope decay unit            X       X
+sweep unit                                    X
+duty cycle generator                          X
+wavelength converter                  X
+random number generator               X
+
+
++-------------------------+
+| Triangle step generator |
++-------------------------+
+ This is a 5-bit, single direction counter, and it is only used in the triangle channel. Each of the 4 LSB outputs of the counter lead to one input on a corresponding mutually exclusive XNOR gate. The 4 XNOR gates have been strobed together, which results in the inverted representation of the 4 LSB of the counter appearing on the outputs of the gates when the strobe is 0, and a non-inverting action taking place when the strobe is 1. The strobe is naturally connected to the MSB of the counter, which effectively produces on the output of the XNOR gates a count sequence which reflects the scenario of a near-ideal triangle step generator (D,E,F,F,E,D,...,2,1,0,0,1,2,...). At this point, the outputs of the XNOR gates will be fed into the input of a 4-bit DAC.
+
+ This 5-bit counter will be halted whenever the Triangle channel's length or linear counter contains a count of 0. This results in a "latching" behaviour; the counter will NOT be reset to any definite state.
+
+ On system reset, this counter is loaded with 0.
+
+ The counter's clock input is connected directly to the terminal count output pin of the 11-bit programmable timer in the triangle channel. As a result of the 5-bit triangle step generator, the output triangle wave frequency will be 1/32 of the frequency that the triangle channel's programmable timer is set to generate.
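The XNOR arrangement described in this section can be sketched in a few lines. The function name `triangle_output` is hypothetical; the logic follows the text: the 4 LSB of the 5-bit counter are inverted while the MSB (strobe) is 0 and passed through unchanged while it is 1.

```python
def triangle_output(counter):
    """4-bit value fed to the DAC for a 5-bit step counter value (0..31)."""
    low4 = counter & 0x0F
    strobe = (counter >> 4) & 1
    # XNOR against the strobe: inverted when strobe is 0, unchanged when 1.
    return low4 if strobe else (~low4 & 0x0F)


# One full period of the triangle wave, starting from the reset value 0.
steps = [triangle_output(c) for c in range(32)]
```

Starting from the reset value of 0, `steps` counts down from $F to 0, repeats 0, then counts back up to $F, which is the 32-step sequence the text describes.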
+
+
++----------------+
+| Linear counter |
++----------------+
+ The linear counter is only found in the triangle channel. It is a 7-bit presettable down counter, with a decoded output condition of 0 available (not exactly the same as terminal count). Here are the bit assignments:
+
+$4008 bits
+----------
+0-6 bits 0-6 of the linear counter load register (NOT the linear counter itself)
+7 linear counter start
+
+ The counter is clocked at 240 Hz (1/4 framerate), and the calculated length in frames is 0.25*N, where N is the 7-bit loaded value. The counter is always being clocked, except when 0 appears on the output of the counter. At this point, the linear counter & triangle step counter clock signals are disabled, which results in both counters latching their current state (the linear counter will stay at 0, and the triangle step counter will stop, and the channel will be silenced due to this).
+
+ The linear counter has 2 modes: load, and count. When the linear counter is in load mode, it essentially becomes transparent (i.e. whatever value is currently in, or being written to $4008, will appear on the output of the counter). Because of this, no count action can occur in load mode. When the mode changes from load to count, the counter will now latch the value currently in it, and start counting down from there. In the count mode, the current value of $4008 is ignored by the counter (but still retained in $4008). Described below is how the mode of the linear counter is set:
+
+
+Writes to $400B
+---------------
+cur mode
+--- ----
+1 load
+0 load (on next linear counter clock), count
+
+ Cur is the current state of the MSB of $4008.
+
+
+Writes to $4008
+---------------
+old new mode
+--- --- ----
+0 X count
+1 0 no change (during the CPU write cycle), count
+1 1 no change
+
+ Old and new represent the state(s) of the MSB of $4008. Old is the value being replaced in the MSB of $4008 on the write, and new is the value replacing the old one.
+
+ "no change" indicates that the mode of the linear counter will not change from the last.
+
+ Note that writes to $400B when $4008.7=0 only loads the linear counter with the value in $4008 on the next *linear* counter clock (and NOT at the end of the CPU write cycle). This is a correction from older versions of this doc.
+
+
++--------------------+
+| Programmable timer |
++--------------------+
+ The programmable timer is an 11-bit presettable down counter, and is found in the square, triangle, and noise channel(s). The bit assignments are as follows:
+
+$4002(sq1)/$4006(sq2)/$400A(Tri) bits
+-------------------------------------
+0-7 represent bits 0-7 of the 11-bit wavelength
+
+$4003(sq1)/$4007(sq2)/$400B(Tri) bits
+-------------------------------------
+0-2 represent bits 8-A of the 11-bit wavelength
+
+ Note that on the noise channel, the 11 bits are not available directly. See the wavelength converter section, for more details.
+
+ The counter has automatic synchronous reloading upon terminal count (count=0), therefore the counter will count for N+1 (N is the 11-bit loaded value) clock cycles before arriving at terminal count, and reloading. This counter will typically be clocked at the 2A03's internal 6502 speed (1.79 MHz), and produces an output frequency of 1.79 MHz/(N+1). The terminal count's output spike length is typically no longer than half a CPU clock. The TC signal will then be fed to the appropriate device for the particular sound channel (for square, this terminal count spike will lead to the duty cycle generator. For the triangle, the spike will be fed to the triangle step generator. For noise, this signal will go to the random number generator unit).
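The timer formula, combined with the divide-by-16 duty cycle generator and the divide-by-32 triangle step generator described elsewhere in this document, reproduces the frequency ranges quoted in the channel details. This is only a sketch: the precise CPU clock value (1789772.727 Hz) and the function names are assumptions.

```python
# Assumed NTSC CPU clock; the document only quotes "1.79 MHz".
CPU_CLOCK_HZ = 1789772.727


def timer_hz(n):
    """Terminal count frequency for an 11-bit wavelength value n."""
    return CPU_CLOCK_HZ / (n + 1)


def square_hz(n):
    """Square channel output: the duty cycle generator divides by 16."""
    return timer_hz(n) / 16


def triangle_hz(n):
    """Triangle channel output: the step generator divides by 32."""
    return timer_hz(n) / 32
```

With these definitions, `square_hz(0x7FF)` is about 54.6 Hz and `square_hz(8)` about 12.4 kHz (8 being the smallest wavelength the sweep unit does not silence), while `triangle_hz(0x7FF)` and `triangle_hz(0)` come out near 27.3 Hz and 55.9 kHz.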
+
+
++----------------+
+| Length counter |
++----------------+
+ The length counter is found in all sound channels. It is essentially a 7-bit down counter, and is conditionally clocked at a frequency of 60 Hz.
+
+ When the length counter arrives at a count of 0, the counter will be stopped (stay on 0), and the appropriate channel will be silenced.
+
+ The length counter clock disable bit, found in all the channels, can also be used to halt the count sequence of the length counter for the appropriate channel, by writing a 1 out to it. A 0 condition will permit counting (unless of course, the counter's current count = 0). Location(s) of the length counter clock disable bit:
+
+$4000(sq1)/$4004(sq2)/$400C(noise) bits
+---------------------------------------
+5 length counter clock disable
+
+$4008(tri) bits
+---------------
+7 length counter clock disable
+
+ To load the length counter with a specified count, a write must be made out to the length register. Location(s) of the length register:
+
+$4003(sq1)/$4007(sq2)/$400B(tri)/$400F(noise) bits
+--------------------------------------------------
+3-7 length
+
+ The 5-bit length value written, determines what 7-bit value the length counter will start counting from. A conversion table here will show how the values are translated.
+
+ +-----------------------+
+ | bit3=0 |
+ +-------+---------------+
+ | |frames |
+ |bits +-------+-------+
+ |4-6 |bit7=0 |bit7=1 |
+ +-------+-------+-------+
+ |0 |05 |06 |
+ |1 |0A |0C |
+ |2 |14 |18 |
+ |3 |28 |30 |
+ |4 |50 |60 |
+ |5 |1E |24 |
+ |6 |07 |08 |
+ |7 |0D |10 |
+ +-------+-------+-------+
+
+ +---------------+
+ | bit3=1 |
+ +-------+-------+
+ |bits | |
+ |4-7 |frames |
+ +-------+-------+
+ |0 |7F |
+ |1 |01 |
+ |2 |02 |
+ |3 |03 |
+ |4 |04 |
+ |5 |05 |
+ |6 |06 |
+ |7 |07 |
+ |8 |08 |
+ |9 |09 |
+ |A |0A |
+ |B |0B |
+ |C |0C |
+ |D |0D |
+ |E |0E |
+ |F |0F |
+ +-------+-------+
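The two conversion tables above can be folded into one lookup, sketched below. The names `LENGTH_BIT3_CLEAR` and `length_frames` are hypothetical; the values are copied from the tables, and the argument is the full byte written to $4003/$4007/$400B/$400F.

```python
# bits 4-6 -> (frames when bit7=0, frames when bit7=1), used when bit3=0
LENGTH_BIT3_CLEAR = {
    0: (0x05, 0x06),
    1: (0x0A, 0x0C),
    2: (0x14, 0x18),
    3: (0x28, 0x30),
    4: (0x50, 0x60),
    5: (0x1E, 0x24),
    6: (0x07, 0x08),
    7: (0x0D, 0x10),
}


def length_frames(reg):
    """Length counter start value (in frames) for a register write."""
    if reg & 0x08:                        # bit3=1: bits 4-7 give the count
        index = (reg >> 4) & 0x0F
        return 0x7F if index == 0 else index
    index = (reg >> 4) & 0x07             # bit3=0: bits 4-6 index the table
    return LENGTH_BIT3_CLEAR[index][(reg >> 7) & 1]
```

For instance, `length_frames(0x08)` yields the maximum of $7F frames, and `length_frames(0xC0)` yields $60.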
+
+ The length counter's real-time status for each channel can be read back. A 0 is returned for a zero count status in the length counter (channel's sound is disabled), and 1 for a non-zero status. Here's the bit description of the length counter status register:
+
+$4015(read)
+-----------
+0 length counter status of square wave channel 1
+1 length counter status of square wave channel 2
+2 length counter status of triangle wave channel
+3 length counter status of noise channel
+4 length counter status of DMC (see "DMC.TXT" for details)
+5 unknown
+6 frame IRQ status
+7 IRQ status of DMC (see "DMC.TXT" for details)
+
+ Writing a 0 to the channel enable register will force the length counters to always contain a count equal to 0, which renders that specific channel disabled (as if it doesn't exist). Writing a 1 to the channel enable register disables the forced length counter value of 0, but will not change the count itself (it will still be whatever it was prior to the writing of 1).
+
+ Bit description of the channel enable register:
+
+$4015(write)
+------------
+0 enable square wave channel 1
+1 enable square wave channel 2
+2 enable triangle wave channel
+3 enable noise channel
+4 enable DMC channel (see "DMC.TXT" for details)
+5-7 unknown
+
+ Note that all 5 used bits in this register will be set to 0 upon system reset.
+
+
++-----------+
+| 4-bit DAC |
++-----------+
+ This is just a standard 4-bit DAC with 16 steps of output voltage resolution, and is used by all 4 sound channels. On the 2A03, square wave 1 & 2 are mixed together, and are available via pin 1. Triangle & noise are available on pin 2.
+
+ These analog outputs require a negative current source, to attain linear symmetry on the various output voltage levels generated by the channel(s) (moreover, to get the sound to be audible). Instead of current sources, the NES uses external 100 ohm pull-down resistors. This results in the output waveforms having some linear asymmetry (i.e., as the desired output voltage increases on a linear scale, the actual outputted voltage increases less and less each step).
+
+ The side effect of this is that the DMC's 7-bit DAC port ($4011) is able to indirectly control the volume (somewhat) of both triangle & noise channels. While I have not measured the voltage asymmetry, others on the NESdev messageboards have posted their findings. The conclusion is that when $4011 is 0, triangle & noise volume outputs are at maximum. When $4011 = 7F, the triangle & noise channel outputs operate at only 57% total volume.
+
+ The odd thing is that a few games actually take advantage of this "volume" feature, and write values to $4011 in order to regulate the amplitude of the triangle wave channel's output.
+
+
++------------------------------+
+| Volume / envelope decay unit |
++------------------------------+
+ The volume / envelope decay hardware is found only in the square wave and noise channels.
+
+$4000(sq1)/$4004(sq2)/$400C(noise)
+----------------------------------
+0-3 volume / envelope decay rate
+4 envelope decay disable
+5 envelope decay looping enable
+
+ When the envelope decay disable bit (bit 4) is set (1), the current volume value (bits 0-3) is sent directly to the channel's DAC. However, depending on certain conditions, this 4-bit volume value will be ignored, and a value of 0 will be sent to the DAC instead. This means that while the channel is enabled (producing sound), the output of the channel (what you'll hear from the DAC) will either be the 4-bit volume value, or 0. This also means that a 4-bit volume value of 0 will result in no audible sound. These conditions are as follows:
+
+ - When hardware in the channel wants to disable its sound output (like the length counter, or sweep unit (square channels only)).
+
+ - On the negative portion of the output frequency signal coming from the duty cycle / random number generator hardware (square wave channel / noise channel).
+
+ When the envelope decay disable bit is cleared, bits 0-3 now control the envelope decay rate, and an internal 4-bit down counter (hereon the envelope decay counter) now controls the channel's volume level. "Envelope decay" is used to describe the action of the channel's audio output volume starting from a certain value, and decreasing by 1 at a fixed (linear) rate (which produces a "fade-out" sounding effect). This fixed decrement rate is controlled by the envelope decay rate (bits 0-3). The calculated decrement rate is 240Hz/(N+1), where N is any value between $0-$F.
+
+ When the channel's envelope decay counter reaches a value of 0, depending on the status of the envelope decay looping enable bit (bit 5, which is shared with the length counter's clock disable bit), 2 different things will happen:
+
+bit 5 action
+----- ------
+0 The envelope decay count will stay at 0 (channel silenced).
+1 The envelope decay count will wrap-around to $F (upon the next clock cycle). The envelope decay counter will then continue to count down normally.
+
+ Only a write out to $4003/$4007/$400F will reset the current envelope decay counter to a known state (to $F, the maximum volume level) for the appropriate channel's envelope decay hardware. Otherwise, the envelope decay counter is always counting down (by 1) at the frequency currently contained in the volume / envelope decay rate bits (even when envelope decays are disabled (setting bit 4)), except when the envelope decay counter contains a value of 0, and envelope decay looping (bit 5) is disabled (0).
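The envelope behaviour described in this section can be summarised in a small sketch. The helper names are hypothetical, and the model only covers what the text states: the decrement rate, what happens when the counter is at 0, and which value is used as the channel volume.

```python
def envelope_rate_hz(n):
    """Decrement rate for a 4-bit decay rate value n ($0-$F)."""
    return 240.0 / (n + 1)


def envelope_step(count, loop_enabled):
    """Next value of the 4-bit envelope decay counter after one clock."""
    if count > 0:
        return count - 1
    # At 0: wrap to $F when looping (bit 5) is set, otherwise stay at 0.
    return 0x0F if loop_enabled else 0


def channel_volume(reg, decay_count):
    """Volume used by the channel, given the $4000/$4004/$400C value."""
    if reg & 0x10:              # bit 4 set: envelope decay disabled
        return reg & 0x0F       # fixed volume from bits 0-3
    return decay_count          # otherwise the decay counter is the volume
```

A decay rate of $F therefore steps the volume down at 15 Hz, so a full fade from $F to 0 takes about one second.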
+
+
++------------+
+| Sweep unit |
++------------+
+ The sweep unit is only found in the square wave channels. The controls for the sweep unit have been mapped in at $4001 for square 1, and $4005 for square 2.
+
+ The controls
+ ------------
+ Bit 7 when this bit is set (1), sweeping is active. This results in real-time increasing or decreasing of the current wavelength value (the audible frequency will decrease or increase, respectively). The wavelength value in $4002/3 ($4006/7) is constantly read & updated by the sweep. Modifying the contents of $4002/3 will be immediately audible, and will result in the sweep now starting from this new wavelength value.
+
+ Bits 6-4 These 3 bits represent the sweep refresh rate, or the frequency at which $4002/3 is updated with the new calculated wavelength. The refresh rate frequency is 120Hz/(N+1), where N is the value written, between 0 and 7.
+
+ Bit 3 This bit controls the sweep mode. When this bit is set (1), sweeps will decrease the current wavelength value, whereas a 0 will increase the current wavelength.
+
+ Bits 2-0 These bits control the right shift amount of the new calculated sweep update wavelength. Code that shows how the sweep unit calculates a new sweep wavelength is as follows:
+
+bit 3
+-----
+0 New = Wavelength + (Wavelength >> N)
+1 New = Wavelength - (Wavelength >> N) (minus an additional 1, if using square wave channel 1)
+
+ where N is the shift right value, between 0-7.
+
+ Note that in decrease mode, for subtracting the 2 values:
+ 1's complement (NOT) is being used for square wave channel 1
+ 2's complement (NEG) is being used for square wave channel 2
+
+ This information is currently the only known difference between the 2 square wave channels.
+
+ On each sweep refresh clock, the Wavelength register will be updated with the New value, but only if all 3 of these conditions are met:
+
+ - bit 7 is set (sweeping enabled)
+ - the shift value (which is N in the formula) does not equal 0
+ - the channel's length counter contains a non-zero value
+
+ Notes
+ -----
+ There are certain conditions that will cause the sweep unit to silence the channel, and halt the sweep refresh clock (which effectively stops sweep action, if any). Note that these conditions pertain regardless of any sweep refresh rate values, or if sweeping is enabled/disabled (via bit 7).
+
+ - an 11-bit wavelength value less than $008 will cause this condition
+ - if the sweep unit is currently set to increase mode, the New calculated wavelength value will always be tested to see if a carry (bit $B) was generated or not (if sweeping is enabled, this carry will be examined before the Wavelength register is updated) from the shift addition calculation. If carry equals 1, the channel is silenced, and sweep action is halted.
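A minimal sketch of the sweep calculation and the silencing conditions follows. The function names are hypothetical, `ctrl` stands for the value written to $4001/$4005, and masking the result to 11 bits is an assumption about how the hardware register would hold the value.

```python
def sweep_new_wavelength(wavelength, ctrl, channel):
    """Return (new 11-bit wavelength, carry) for one sweep refresh.

    channel is 1 or 2 (square wave channel number).
    """
    delta = wavelength >> (ctrl & 0x07)
    if ctrl & 0x08:                        # bit 3 set: decrease mode
        new = wavelength - delta
        if channel == 1:
            new -= 1                       # 1's complement on channel 1
        return new & 0x7FF, False
    new = wavelength + delta               # increase mode
    return new & 0x7FF, new > 0x7FF        # carry out of bit $A


def sweep_silences(wavelength, ctrl, channel):
    """True if the sweep unit would silence the channel."""
    if wavelength < 0x008:
        return True
    if not ctrl & 0x08:
        return sweep_new_wavelength(wavelength, ctrl, channel)[1]
    return False


def sweep_refresh_hz(ctrl):
    """Sweep refresh rate from bits 6-4."""
    return 120.0 / (((ctrl >> 4) & 0x07) + 1)
```

With a wavelength of $100 and a shift of 1 in decrease mode, channel 1 moves to $07F while channel 2 moves to $080, which is the one-step difference between the two square channels noted above.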
+
+
++----------------------+
+| Duty cycle generator |
++----------------------+
+ The duty cycle generator takes the frequency produced from the 11-bit programmable timer, and uses a 4-bit counter to produce 4 types of duty cycles. The output frequency is then 1/16 that of the programmable timer. The duty cycle hardware is only found in the square wave channels. The bit assignments are as follows:
+
+$4000(sq1)/$4004(sq2)
+---------------------
+6-7 Duty cycle type
+
+ duty (positive/negative)
+val in clock cycles
+--- ---------------
+00 2/14
+01 4/12
+10 8/ 8
+11 12/ 4
+
+ Where val represents bits 6-7 of $4000/$4004.
+
+ This counter is reset when the length counter of the same channel is written to (via $4003/$4007).
+
+ The output frequency at this point will now be fed to the volume/envelope decay hardware.
+
+
++----------------------+
+| Wavelength converter |
++----------------------+
+ The wavelength converter is only used in the noise channel. It is used to convert a given 4-bit value to an 11-bit wavelength, which then is sent to the noise's own programmable timer. Here are the bit descriptions:
+
+$400E bits
+----------
+0-3 The 4-bit value to be converted
+
+ Below is a conversion chart that shows what 4-bit value will represent the 11-bit wavelength to be fed to the channel's programmable timer:
+
+value octave scale CPU clock cycles (11-bit wavelength+1)
+----- ------ ----- --------------------------------------
+0 15 A 002
+1 14 A 004
+2 13 A 008
+3 12 A 010
+4 11 A 020
+5 11 D 030
+6 10 A 040
+7 10 F 050
+8 10 C 065
+9 9 A 07F
+A 9 D 0BE
+B 8 A 0FE
+C 8 D 17D
+D 7 A 1FC
+E 6 A 3F9
+F 5 A 7F2
+
+ Octave and scale information is provided for the music enthusiast programmer who is more familiar with notes than clock cycles.
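For readers who prefer frequencies to clock cycles, the chart can be converted as sketched below. The cycle counts are copied from the chart; the exact CPU clock value and the function names are assumptions, and the divide-by-2 reflects the statement in the random number generator section that the shift register shifts once per 2 timer pulses.

```python
# Assumed NTSC CPU clock; the document only quotes "1.79 MHz".
CPU_CLOCK_HZ = 1789772.727

# CPU clock cycles (11-bit wavelength + 1) for 4-bit values 0..F
NOISE_CYCLES = [
    0x002, 0x004, 0x008, 0x010, 0x020, 0x030, 0x040, 0x050,
    0x065, 0x07F, 0x0BE, 0x0FE, 0x17D, 0x1FC, 0x3F9, 0x7F2,
]


def noise_timer_hz(value):
    """Programmable timer frequency for a 4-bit $400E value."""
    return CPU_CLOCK_HZ / NOISE_CYCLES[value & 0x0F]


def noise_shift_hz(value):
    """Shift register rate: one shift per 2 timer pulses."""
    return noise_timer_hz(value) / 2
```

Value 0 gives a shift rate of roughly 447 kHz, matching the upper limit quoted in the channel details, and value $F gives a timer frequency close to 880 Hz, an A as listed in the chart.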
+
+
++-------------------------+
+| Random number generator |
++-------------------------+
+ The noise channel has a 1-bit pseudo-random number generator. It's based on a 15-bit shift register, and an exclusive or gate. The generator can produce two types of random number sequences: long, and short. The long sequence generates 32,767-bit long number patterns. The short sequence generates 93-bit long number patterns. The 93-bit mode will generally produce higher sounding playback frequencies on the channel. Here is the bit that controls the mode:
+
+$400E bits
+----------
+7 mode
+
+ If mode=0, then 32,767-bit long number sequences will be produced (32K mode), otherwise 93-bit long number sequences will be produced (93-bit mode).
+
+ The following diagram shows where the XOR taps are taken off the shift register to produce the 1-bit pseudo-random number sequences for each mode.
+
+mode      <-----
+----      EDCBA9876543210
+32K       **
+93-bit    *     *
+
+ The current result of the XOR will be transferred into bit position 0 of the SR, upon the next shift cycle. The 1-bit random number output is taken from pin E, is inverted, then is sent to the volume/envelope decay hardware for the noise channel. The shift register is shifted upon receiving 2 clock pulses from the programmable timer (the shift frequency will be half that of the frequency from the programmable timer (one octave lower)).
+
+ On system reset, this shift register is loaded with a value of 1.
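The generator can be modelled as a left-shifting 15-bit register whose XOR result enters at bit 0. In the sketch below the tap positions (bits E and D for the 32K mode, bits E and 8 for the 93-bit mode) are an assumption about the tap diagram; they do reproduce the two sequence lengths stated above. The function names are hypothetical.

```python
def lfsr_step(state, short_mode):
    """Shift the 15-bit register left once, feeding the XOR into bit 0."""
    tap = 8 if short_mode else 13          # second tap; the first is bit E
    feedback = ((state >> 14) ^ (state >> tap)) & 1
    return ((state << 1) | feedback) & 0x7FFF


def noise_output(state):
    """1-bit output: bit E of the register, inverted."""
    return ((state >> 14) & 1) ^ 1


def sequence_length(short_mode, seed=1):
    """Number of shifts before the register returns to its seed value."""
    state = lfsr_step(seed, short_mode)
    count = 1
    while state != seed:
        state = lfsr_step(state, short_mode)
        count += 1
    return count
```

Starting from the reset value of 1, `sequence_length(False)` is 32767 and `sequence_length(True)` is 93.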
+
+
+RP2A03E quirk
+-------------
+ I have been informed that revisions of the 2A03 before "F" actually lacked support for the 93-bit looped noise playback mode. While the Famicom's 2A03 went through 4 revisions (E..H), I think that only one was ever used for the front loading NES: "G". Other differences between 2A03 revisions are unknown.
+
+
+EOF \ No newline at end of file
diff --git a/documentation/tech/cpu/nessound.txt b/documentation/tech/cpu/nessound.txt
new file mode 100644
index 00000000..bb6d0598
--- /dev/null
+++ b/documentation/tech/cpu/nessound.txt
@@ -0,0 +1,697 @@
+The NES sound channel guide 1.8
+Written by Brad Taylor.
+btmine@hotmail.com
+
+Last updated: July 27th, 2000.
+
+All results were obtained by studying prior information available (from
+nestech 1.00, and postings on NESDev from miscellaneous people), and through
+a series of experiments conducted by me. Results acquired by individuals
+prior to my reverse-engineering have been double checked, and final results
+have been confirmed. Credit is due to those individual(s) who contributed
+any information in regards to the miscellaneous sound channels within
+the NES.
+
+A special thanks goes out to Matthew Conte, for his expertise on
+pseudo-random number generation (among other things), which allowed for the
+full reverse engineering of the NES's noise channel to take place. Without
+his help, I would still be trying to find a needle in a haystack, as far as
+the noise's method of pseudo-random number generation goes. Additionally,
+his previous findings / reverse engineering work on the NES's sound hardware
+really got the ball of NES sound emulation rolling. If it weren't for Matt's
+original work, this document wouldn't exist.
+
+Thanks to Kentaro Ishihara, for his excellent work on finding the difference
+in upward frequency sweep between the 2 square wave channels.
+
+****************
+* Introduction *
+****************
+
+The 2A03 (NES's integrated CPU) has 4 internal channels to it that have the
+ability to generate semi-analog sound, for musical playback purposes. These
+channels are 2 square wave channels, one triangle wave channel, and a noise
+generation channel. This document will go into full detail on every aspect
+of the operation and timing of the mentioned sound channels.
+
+
+*******************
+* Channel details *
+*******************
+
+Each channel has different characteristics to it that make up its
+operation.
+
+The square channel(s) have the ability to generate a square wave frequency
+in the range of 54.6 Hz to 12.4 kHz. Their key features are frequency sweep
+abilities, and output duty cycle adjustment.
+
+The triangle wave channel has the ability to generate an output triangle
+wave with a resolution of 4-bits (16 steps), in the range of 27.3 Hz to 55.9
+kHz. The key features of this channel are its analog triangle wave output,
+and its linear counter, which can be set to automatically disable the
+channel's sound after a certain period of time has gone by.
+
+The noise channel is used for producing random frequencies, which results in
+a "noisy" sounding output. Output frequencies can range anywhere from 29.3
+Hz to 447 kHz. Its key feature is its pseudo-random number generator,
+which generates the random output frequencies heard on the channel.
+
+
+*****************
+* Frame counter *
+*****************
+
+The 2A03 has an internal frame counter. It has the ability to generate 60 Hz
+(1/1 framerate), 120 Hz (1/2 framerate), and 240 Hz (1/4 framerate) signals,
+used by some of the sound hardware. The 1/4 framerate is calculated by
+taking twice the CPU clock speed (2 * 1789772.727 Hz = 3579545.454545 Hz),
+and dividing it by 14915 (i.e., the divide-by-14915 counter is decremented
+on the rising AND falling edge of the CPU's clock signal).
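+
+As a quick check of the arithmetic above, the following sketch derives the
+three frame counter rates. The names are illustrative only, and the CPU
+clock is assumed to be half of the 3579545.454545 Hz figure quoted above.

```python
# Illustrative names; the CPU clock is assumed to be half the doubled figure in the text.
CPU_CLOCK_HZ = 3579545.454545 / 2   # about 1.79 MHz
FRAME_DIVIDER = 14915               # counted on both edges of the CPU clock


def frame_counter_rates():
    """Return the (1/4, 1/2, 1/1) framerate signals in Hz."""
    quarter = (2 * CPU_CLOCK_HZ) / FRAME_DIVIDER
    return quarter, quarter / 2, quarter / 4


quarter, half, full = frame_counter_rates()
print(f"{quarter:.3f} Hz, {half:.3f} Hz, {full:.3f} Hz")
```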
+
+
+************************
+* Sound hardware delay *
+************************
+
+After resetting the 2A03, the first time any sound channel's length counter
+contains a non-zero value (channel is enabled), there will be a 2048 CPU
+clock cycle delay before any of the sound hardware is clocked. After the 2K
+clock cycles go by, the NES sound hardware will be clocked normally. This
+phenomenon only occurs after a system reset, and only during the first
+2048 CPU clocks after a sound channel is first enabled.
+
+The information in regards to this delay is only provided to keep this
+entire document consistently accurate on the 2A03's sound hardware, but may
+not be 100% accurate in itself. I haven't done many tests on the behaviour
+of this delay (mainly because I don't care, as I view it as an inconvenience
+anyway), so that's why I believe there could be some inaccuracies.
+
+
+************************
+* Register Assignments *
+************************
+
+The sound hardware internal to the 2A03 has been designated these special
+memory addresses in the CPU's memory map.
+
+$4000-$4003 Square wave 1
+$4004-$4007 Square wave 2 (identical to the first, except for upward
+frequency sweeps (see "sweep unit" section))
+$4008-$400B Triangle
+$400C-$400F Noise
+$4015 Channel enable / length counter status
+
+Note that $4015 is the only R/W register. All others are write only (attempts
+to read them will most likely return 040H, due to heavy
+capacitance on the NES's data bus). Reading a "write only" register will
+have no effect on the specific register, or channel.
+
+Every sound channel has 4 registers affiliated with it. The description of
+the register sets are as follows:
+
++----------------+
+| Register set 1 |
++----------------+
+
+$4000(sq1)/$4004(sq2)/$400C(noise) bits
+---------------------------------------
+0-3 volume / envelope decay rate
+4 envelope decay disable
+5 length counter clock disable / envelope decay looping enable
+6-7 duty cycle type (unused on noise channel)
+
+$4008(tri) bits
+---------------
+0-6 linear counter load register
+7 length counter clock disable / linear counter start
+
+
++----------------+
+| Register set 2 |
++----------------+
+
+$4001(sq1)/$4005(sq2) bits
+--------------------------
+0-2 right shift amount
+3 decrease / increase (1/0) wavelength
+4-6 sweep update rate
+7 sweep enable
+
+$4009(tri)/$400D(noise) bits
+----------------------------
+0-7 unused
+
+
++----------------+
+| Register set 3 |
++----------------+
+
+$4002(sq1)/$4006(sq2)/$400A(Tri) bits
+-------------------------------------
+0-7 8 LSB of wavelength
+
+$400E(noise) bits
+-----------------
+0-3 playback sample rate
+4-6 unused
+7 random number type generation
+
+
++----------------+
+| Register set 4 |
++----------------+
+
+$4003(sq1)/$4007(sq2)/$400B(tri)/$400F(noise) bits
+--------------------------------------------------
+0-2 3 MS bits of wavelength (unused on noise channel)
+3-7 length counter load register
+
+
++--------------------------------+
+| length counter status register |
++--------------------------------+
+
+$4015(read)
+-----------
+0 square wave channel 1
+1 square wave channel 2
+2 triangle wave channel
+3 noise channel
+4 DMC (see "DMC.TXT" for details)
+5-6 unused
+7 IRQ status of DMC (see "DMC.TXT" for details)
+
+
++-------------------------+
+| channel enable register |
++-------------------------+
+
+$4015(write)
+------------
+0 square wave channel 1
+1 square wave channel 2
+2 triangle wave channel
+3 noise channel
+4 DMC channel (see "DMC.TXT" for details)
+5-7 unused
+
+
+************************
+* Channel architecture *
+************************
+
+This section will describe the internal components making up each individual
+channel. Each component will then be described in full detail.
+
+Device                      Triangle  Noise  Square
+------                      --------  -----  ------
+triangle step generator        X
+linear counter                 X
+programmable timer             X        X      X
+length counter                 X        X      X
+4-bit DAC                      X        X      X
+volume/envelope decay unit              X      X
+sweep unit                                     X
+duty cycle generator                           X
+wavelength converter                    X
+random number generator                 X
+
+
++-------------------------+
+| Triangle step generator |
++-------------------------+
+
+This is a 5-bit, single direction counter, and it is only used in the
+triangle channel. Each of the 4 LSB outputs of the counter lead to one input
+on a corresponding mutually exclusive XNOR gate. The 4 XNOR gates have been
+strobed together, which results in the inverted representation of the 4 LSB
+of the counter appearing on the outputs of the gates when the strobe is 0,
+and a non-inverting action taking place when the strobe is 1. The strobe is
+naturally connected to the MSB of the counter, which effectively produces on
+the output of the XNOR gates a count sequence which reflects the scenario of
+a near-ideal triangle step generator (D,E,F,F,E,D,...,2,1,0,0,1,2,...). At
+this point, the outputs of the XNOR gates will be fed into the input of a
+4-bit DAC.
+
+This 5-bit counter will be halted whenever the Triangle channel's length or
+linear counter contains a count of 0. This results in a "latching"
+behaviour; the counter will NOT be reset to any definite state.
+
+On system reset, this counter is loaded with 0.
+
+The counter's clock input is connected directly to the terminal count output
+pin of the 11-bit programmable timer in the triangle channel. As a result of
+the 5-bit triangle step generator, the output triangle wave frequency will
+be 32 times lower than the frequency that the triangle channel's programmable
+timer is set to generate.
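+
+The XNOR arrangement described in this section can be sketched as follows.
+This is only a model of the logic as described here, not a circuit-accurate
+implementation, and the function name is made up for illustration.

```python
def triangle_output(counter):
    """Map the 5-bit step counter to the 4-bit value fed to the DAC."""
    strobe = (counter >> 4) & 1          # MSB of the 5-bit counter
    low = counter & 0x0F                 # the 4 LSBs, one per XNOR gate
    # XNOR with the strobe: pass through when strobe=1, invert when strobe=0.
    return low if strobe else (~low & 0x0F)


# One full pass of the 5-bit counter, starting from its reset value of 0.
sequence = [triangle_output(c) for c in range(32)]
print(" ".join(f"{v:X}" for v in sequence))
```

+Running the counter through all 32 states gives 16 falling steps followed
+by 16 rising steps, which is why the output frequency is 1/32 of the
+timer's frequency.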
+
+
++----------------+
+| Linear counter |
++----------------+
+
+The linear counter is only found in the triangle channel. It is a 7-bit
+presettable down counter, with a decoded output condition of 0 available
+(not exactly the same as terminal count). Here's the bit assignments:
+
+$4008 bits
+----------
+0-6 bits 0-6 of the linear counter load register (NOT the linear counter
+itself)
+7 linear counter start
+
+The counter is clocked at 240 Hz (1/4 framerate), and the calculated length
+in frames is 0.25*N, where N is the 7-bit loaded value. The counter is
+always being clocked, except when 0 appears on the output of the counter. At
+this point, the linear counter & triangle step counter clock signals are
+disabled, which results in both counters latching their current state (the
+linear counter will stay at 0, and the triangle step counter will stop, and
+the channel will be silenced due to this).
+
+The linear counter has 2 modes: load, and count. When the linear counter is
+in load mode, it essentially becomes transparent (i.e. whatever value is
+currently in, or being written to $4008, will appear on the output of the
+counter). Because of this, no count action can occur in load mode. When the
+mode changes from load to count, the counter will now latch the value
+currently in it, and start counting down from there. In the count mode, the
+current value of $4008 is ignored by the counter (but still retained in
+$4008). Described below is how the mode of the linear counter is set:
+
+Writes to $400B
+---------------
+cur mode
+--- ----
+1 load
+0 load (during the write cycle), count
+
+Cur is the current state of the MSB of $4008.
+
+Writes to $4008
+---------------
+old new mode
+--- --- ----
+0 X count
+1 0 no change (during the write cycle), count
+1 1 no change
+
+Old and new represent the state(s) of the MSB of $4008. Old is the value
+being replaced in the MSB of $4008 on the write, and new is the value
+replacing the old one.
+
+"no change" indicates that the mode of the linear counter will not change
+from the last.
+
+
++--------------------+
+| Programmable timer |
++--------------------+
+
+The programmable timer is an 11-bit presettable down counter, and is found in
+the square, triangle, and noise channel(s). The bit assignments are as
+follows:
+
+$4002(sq1)/$4006(sq2)/$400A(Tri) bits
+-------------------------------------
+0-7 represent bits 0-7 of the 11-bit wavelength
+
+$4003(sq1)/$4007(sq2)/$400B(Tri) bits
+-------------------------------------
+0-2 represent bits 8-A of the 11-bit wavelength
+
+Note that on the noise channel, the 11 bits are not available directly. See
+the wavelength converter section, for more details.
+
+The counter has automatic synchronous reloading upon terminal count
+(count=0), therefore the counter will count for N+1 (N is the 11-bit loaded
+value) clock cycles before arriving at terminal count, and reloading. This
+counter will typically be clocked at the 2A03's internal 6502 speed (1.79
+MHz), and produces an output frequency of 1.79 MHz/(N+1). The terminal
+count's output spike length is typically no longer than half a CPU clock.
+The TC signal will then be fed to the appropriate device for the particular
+sound channel (for square, this terminal count spike will lead to the duty
+cycle generator. For the triangle, the spike will be fed to the triangle
+step generator. For noise, this signal will go to the random number
+generator unit).
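+
+To make the N+1 relationship concrete, here is a small sketch that converts
+an 11-bit wavelength into output frequencies. The divide-by-16 and
+divide-by-32 factors come from the duty cycle generator and triangle step
+generator sections; the function names and the exact NTSC clock constant
+are assumptions made for the example.

```python
CPU_CLOCK_HZ = 1789772.727  # NTSC 2A03 clock (about 1.79 MHz), assumed here


def timer_frequency(wavelength):
    """Terminal-count rate of the 11-bit timer loaded with `wavelength`."""
    if not 0 <= wavelength <= 0x7FF:
        raise ValueError("wavelength must fit in 11 bits")
    return CPU_CLOCK_HZ / (wavelength + 1)


def square_frequency(wavelength):
    return timer_frequency(wavelength) / 16   # 4-bit duty cycle counter


def triangle_frequency(wavelength):
    return timer_frequency(wavelength) / 32   # 5-bit triangle step generator


print(round(square_frequency(0x7FF), 1))      # lowest square wave pitch
print(round(triangle_frequency(0x7FF), 1))    # lowest triangle wave pitch
```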
+
+
++----------------+
+| Length counter |
++----------------+
+
+The length counter is found in all sound channels. It is essentially a 7-bit
+down counter, and is conditionally clocked at a frequency of 60 Hz.
+
+When the length counter arrives at a count of 0, the counter will be stopped
+(stay on 0), and the appropriate channel will be silenced.
+
+The length counter clock disable bit, found in all the channels, can also be
+used to halt the count sequence of the length counter for the appropriate
+channel, by writing a 1 out to it. A 0 condition will permit counting
+(unless of course, the counter's current count = 0). Location(s) of the
+length counter clock disable bit:
+
+$4000(sq1)/$4004(sq2)/$400C(noise) bits
+---------------------------------------
+5 length counter clock disable
+
+$4008(tri) bits
+---------------
+7 length counter clock disable
+
+To load the length counter with a specified count, a write must be made out
+to the length register. Location(s) of the length register:
+
+$4003(sq1)/$4007(sq2)/$400B(tri)/$400F(noise) bits
+--------------------------------------------------
+3-7 length
+
+The 5-bit length value written, determines what 7-bit value the length
+counter will start counting from. A conversion table here will show how the
+values are translated.
+
+ +-----------------------+
+ | bit3=0 |
+ +-------+---------------+
+ | |frames |
+ |bits +-------+-------+
+ |4-6 |bit7=0 |bit7=1 |
+ +-------+-------+-------+
+ |0 |05 |06 |
+ |1 |0A |0C |
+ |2 |14 |18 |
+ |3 |28 |30 |
+ |4 |50 |60 |
+ |5 |1E |24 |
+ |6 |07 |08 |
+ |7 |0E |10 |
+ +-------+-------+-------+
+
+ +---------------+
+ | bit3=1 |
+ +-------+-------+
+ |bits | |
+ |4-7 |frames |
+ +-------+-------+
+ |0 |7F |
+ |1 |01 |
+ |2 |02 |
+ |3 |03 |
+ |4 |04 |
+ |5 |05 |
+ |6 |06 |
+ |7 |07 |
+ |8 |08 |
+ |9 |09 |
+ |A |0A |
+ |B |0B |
+ |C |0C |
+ |D |0D |
+ |E |0E |
+ |F |0F |
+ +-------+-------+
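+
+The two conversion tables can be folded into one lookup. The sketch below
+simply transcribes the tables above; the function and table names are
+hypothetical.

```python
# Frame counts (hex) copied from the bit3=0 table: index -> (bit7=0, bit7=1).
LENGTH_BIT3_CLEAR = {
    0: (0x05, 0x06), 1: (0x0A, 0x0C), 2: (0x14, 0x18), 3: (0x28, 0x30),
    4: (0x50, 0x60), 5: (0x1E, 0x24), 6: (0x07, 0x08), 7: (0x0E, 0x10),
}


def length_in_frames(register_value):
    """Decode bits 3-7 of a write to $4003/$4007/$400B/$400F."""
    if register_value & 0x08:                 # bit3=1: bits 4-7 index directly
        index = (register_value >> 4) & 0x0F
        return 0x7F if index == 0 else index
    index = (register_value >> 4) & 0x07      # bit3=0: bits 4-6 pick the row
    bit7 = (register_value >> 7) & 1          # bit 7 picks the column
    return LENGTH_BIT3_CLEAR[index][bit7]


print(hex(length_in_frames(0x08)))            # bit3=1, index 0
```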
+
+The length counter's real-time status for each channel can be attained. A 0
+is returned for a zero count status in the length counter (channel's sound
+is disabled), and 1 for a non-zero status. Here's the bit description of the
+length counter status register:
+
+$4015(read)
+-----------
+0 length counter status of square wave channel 1
+1 length counter status of square wave channel 2
+2 length counter status of triangle wave channel
+3 length counter status of noise channel
+4 length counter status of DMC (see "DMC.TXT" for details)
+5-6 unused
+7 IRQ status of DMC (see "DMC.TXT" for details)
+
+Writing a 0 to the channel enable register will force the length counters to
+always contain a count equal to 0, which renders that specific channel
+disabled (as if it doesn't exist). Writing a 1 to the channel enable
+register disables the forced length counter value of 0, but will not change
+the count itself (it will still be whatever it was prior to the writing of
+1).
+
+Bit description of the channel enable register:
+
+$4015(write)
+------------
+0 enable square wave channel 1
+1 enable square wave channel 2
+2 enable triangle wave channel
+3 enable noise channel
+4 enable DMC channel (see "DMC.TXT" for details)
+5-7 unused
+
+Note that all 5 used bits in this register will be set to 0 upon system
+reset.
+
+
++-----------+
+| 4-bit DAC |
++-----------+
+
+This is just a standard 4-bit DAC with 16 steps of output voltage
+resolution, and is used by all 4 sound channels.
+
+On the 2A03, square wave 1 & 2 are mixed together, and are available via pin
+1. Triangle & noise are available on pin 2. These analog outputs require a
+negative current source, to attain linear symmetry on the various output
+voltage levels generated by the channel(s) (moreover, to get the sound to be
+audible). Since the NES just uses external 100 ohm pull-down resistors, this
+results in the output waveforms being of very small amplitude, but with
+minimal linearity asymmetry.
+
+
++------------------------------+
+| Volume / envelope decay unit |
++------------------------------+
+
+The volume / envelope decay hardware is found only in the square wave and
+noise channels.
+
+$4000(sq1)/$4004(sq2)/$400C(noise)
+----------------------------------
+0-3 volume / envelope decay rate
+4 envelope decay disable
+5 envelope decay looping enable
+
+When the envelope decay disable bit (bit 4) is set (1), the current volume
+value (bits 0-3) is sent directly to the channel's DAC. However, depending
+on certain conditions, this 4-bit volume value will be ignored, and a value
+of 0 will be sent to the DAC instead. This means that while the channel is
+enabled (producing sound), the output of the channel (what you'll hear from
+the DAC) will either be the 4-bit volume value, or 0. This also means that a
+4-bit volume value of 0 will result in no audible sound. These conditions
+are as follows:
+
+- When hardware in the channel wants to disable its sound output (like the
+length counter, or sweep unit (square channels only)).
+
+- On the negative portion of the output frequency signal coming from the
+duty cycle / random number generator hardware (square wave channel / noise
+channel).
+
+When the envelope decay disable bit is cleared, bits 0-3 now control the
+envelope decay rate, and an internal 4-bit down counter (hereon the envelope
+decay counter) now controls the channel's volume level. "Envelope decay" is
+used to describe the action of the channel's audio output volume starting
+from a certain value, and decreasing by 1 at a fixed (linear) rate (which
+produces a "fade-out" sounding effect). This fixed decrement rate is
+controlled by the envelope decay rate (bits 0-3). The calculated decrement
+rate is 240Hz/(N+1), where N is any value between $0-$F.
+
+When the channel's envelope decay counter reaches a value of 0, depending on
+the status of the envelope decay looping enable bit (bit 5, which is shared
+with the length counter's clock disable bit), 2 different things will
+happen:
+
+bit 5 action
+----- ------
+0 The envelope decay count will stay at 0 (channel silenced).
+1 The envelope decay count will wrap-around to $F (upon the next clock
+cycle). The envelope decay counter will then continue to count down
+normally.
+
+Only a write out to $4003/$4007/$400F will reset the current envelope decay
+counter to a known state (to $F, the maximum volume level) for the
+appropriate channel's envelope decay hardware. Otherwise, the envelope decay
+counter is always counting down (by 1) at the frequency currently contained
+in the volume / envelope decay rate bits (even when envelope decays are
+disabled (setting bit 4)), except when the envelope decay counter contains a
+value of 0, and envelope decay looping (bit 5) is disabled (0).
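+
+A rough model of the envelope decay counter follows. It assumes the
+240Hz/(N+1) rate is produced by a simple divider that reloads with N, which
+is one plausible arrangement and not something confirmed by the text; the
+class and method names are invented for the example.

```python
class EnvelopeDecay:
    """Sketch of the 4-bit envelope decay counter for one channel."""

    def __init__(self, rate, loop):
        self.rate = rate & 0x0F      # bits 0-3: envelope decay rate (N)
        self.loop = bool(loop)       # bit 5: envelope decay looping enable
        self.divider = self.rate     # assumed divider giving 240Hz/(N+1)
        self.count = 0x0F            # state after a write to $4003/$4007/$400F

    def clock_240hz(self):
        """Apply one 240 Hz (1/4 framerate) clock."""
        if self.divider > 0:
            self.divider -= 1
            return
        self.divider = self.rate     # N+1 clocks per decrement
        if self.count > 0:
            self.count -= 1
        elif self.loop:
            self.count = 0x0F        # wrap around on the next clock


env = EnvelopeDecay(rate=0, loop=True)
for _ in range(16):
    env.clock_240hz()
print(hex(env.count))
```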
+
+
++------------+
+| Sweep unit |
++------------+
+
+The sweep unit is only found in the square wave channels. The controls for
+the sweep unit have been mapped in at $4001 for square 1, and $4005 for
+square 2.
+
+The controls
+------------
+Bit 7 when this bit is set (1), sweeping is active. This results in
+real-time increasing or decreasing of the current wavelength value (the
+audible frequency will decrease or increase, respectively). The wavelength
+value in $4002/3 ($4006/7) is constantly read & updated by the sweep.
+Modifying the contents of $4002/3 will be immediately audible, and will
+result in the sweep now starting from this new wavelength value.
+
+Bits 6-4 These 3 bits represent the sweep refresh rate, or the frequency at
+which $4002/3 is updated with the new calculated wavelength. The refresh
+rate frequency is 120Hz/(N+1), where N is the value written, between 0 and
+7.
+
+Bit 3 This bit controls the sweep mode. When this bit is set (1), sweeps
+will decrease the current wavelength value, whereas a 0 will increase the
+current wavelength.
+
+Bits 2-0 These bits control the right shift amount of the new calculated
+sweep update wavelength. Code that shows how the sweep unit calculates a new
+sweep wavelength is as follows:
+
+bit 3
+-----
+0 New = Wavelength + (Wavelength >> N)
+1 New = Wavelength - (Wavelength >> N) (minus an additional 1, if using
+square wave channel 1)
+
+where N is the shift right value, between 0-7.
+
+Note that in decrease mode, for subtracting the 2 values:
+1's complement (NOT) is being used for square wave channel 1
+2's complement (NEG) is being used for square wave channel 2
+
+This information is currently the only known difference between the 2 square
+wave channels.
+
+On each sweep refresh clock, the Wavelength register will be updated with
+the New value, but only if all 3 of these conditions are met:
+
+- bit 7 is set (sweeping enabled)
+- the shift value (which is N in the formula) does not equal 0
+- the channel's length counter contains a non-zero value
+
+Notes
+-----
+There are certain conditions that will cause the sweep unit to silence the
+channel, and halt the sweep refresh clock (which effectively stops sweep
+action, if any). Note that these conditions pertain regardless of any sweep
+refresh rate values, or if sweeping is enabled/disabled (via bit 7).
+
+- an 11-bit wavelength value less than $008 will cause this condition
+- if the sweep unit is currently set to increase mode, the New calculated
+wavelength value will always be tested to see if a carry (bit $B) was
+generated or not (if sweeping is enabled, this carry will be examined before
+the Wavelength register is updated) from the shift addition calculation. If
+carry equals 1, the channel is silenced, and sweep action is halted.
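+
+The sweep calculation and the two silencing conditions can be sketched as
+below. The function names and argument layout are hypothetical; only the
+formulas come from this section.

```python
def sweep_target(wavelength, shift, decrease, channel):
    """Return the New wavelength computed by the sweep unit.

    channel is 1 or 2; only square wave channel 1 subtracts the extra 1,
    because it adds the 1's complement instead of the 2's complement.
    """
    delta = wavelength >> shift
    if not decrease:
        return wavelength + delta
    return wavelength - delta - (1 if channel == 1 else 0)


def sweep_silences(wavelength, shift, decrease, channel):
    """True when the sweep unit would silence the channel."""
    if wavelength < 0x008:
        return True
    if not decrease:
        # A carry out of the 11-bit addition (bit $B) silences the channel.
        return sweep_target(wavelength, shift, decrease, channel) > 0x7FF
    return False


print(hex(sweep_target(0x100, 1, True, 1)), hex(sweep_target(0x100, 1, True, 2)))
```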
+
+
++----------------------+
+| Duty cycle generator |
++----------------------+
+
+The duty cycle generator takes the frequency produced from the 11-bit
+programmable timer, and uses a 4 bit counter to produce 4 types of duty
+cycles. The output frequency is then 1/16 that of the programmable timer.
+The duty cycle hardware is only found in the square wave channels. The bit
+assignments are as follows:
+
+$4000(sq1)/$4004(sq2)
+---------------------
+6-7 Duty cycle type
+
+ duty (positive/negative)
+val in clock cycles
+--- ---------------
+00 2/14
+01 4/12
+10 8/ 8
+11 12/ 4
+
+Where val represents bits 6-7 of $4000/$4004.
+
+The output frequency at this point will now be fed to the volume/envelope
+decay hardware.
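+
+The duty cycle table can be modelled with a 16-step counter. The sketch
+below assumes the positive portion comes first in each cycle, which the
+text does not state; the names are illustrative.

```python
# Positive clock counts per 16-step cycle, from the duty cycle table.
POSITIVE_CLOCKS = {0b00: 2, 0b01: 4, 0b10: 8, 0b11: 12}


def duty_output(duty_type, step):
    """Return 1 during the positive portion of the 16-step cycle, else 0.

    Assumes the positive portion occupies the first steps of the cycle.
    """
    return 1 if (step & 0x0F) < POSITIVE_CLOCKS[duty_type & 0b11] else 0


for duty in range(4):
    print(f"{duty:02b}", "".join(str(duty_output(duty, s)) for s in range(16)))
```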
+
+
++----------------------+
+| Wavelength converter |
++----------------------+
+
+The wavelength converter is only used in the noise channel. It is used to
+convert a given 4-bit value to an 11-bit wavelength, which then is sent to
+the noise's own programmable timer. Here are the bit descriptions:
+
+$400E bits
+----------
+0-3 The 4-bit value to be converted
+
+Below is a conversion chart that shows what 4-bit value will represent the
+11-bit wavelength to be fed to the channel's programmable timer:
+
+value octave scale CPU clock cycles (11-bit wavelength+1)
+----- ------ ----- --------------------------------------
+0 15 A 002
+1 14 A 004
+2 13 A 008
+3 12 A 010
+4 11 A 020
+5 11 D 030
+6 10 A 040
+7 10 F 050
+8 10 C 065
+9 9 A 07F
+A 9 D 0BE
+B 8 A 0FE
+C 8 D 17D
+D 7 A 1FC
+E 6 A 3F9
+F 5 A 7F2
+
+Octave and scale information is provided for the music enthusiast programmer
+who is more familiar with notes than clock cycles.
+
+
++-------------------------+
+| Random number generator |
++-------------------------+
+
+The noise channel has a 1-bit pseudo-random number generator. It's based on
+a 15-bit shift register, and an exclusive or gate. The generator can produce
+two types of random number sequences: long, and short. The long sequence
+generates 32,767-bit long number patterns. The short sequence generates
+93-bit long number patterns. The 93-bit mode will generally produce higher
+sounding playback frequencies on the channel. Here is the bit that controls
+the mode:
+
+$400E bits
+----------
+7 mode
+
+If mode=0, then 32,767-bit long number sequences will be produced (32K
+mode), otherwise 93-bit long number sequences will be produced (93-bit
+mode).
+
+The following diagram shows where the XOR taps are taken off the shift
+register to produce the 1-bit pseudo-random number sequences for each mode.
+
+mode       <-----
+----       EDCBA9876543210
+32K        **               (taps at bits E and D)
+93-bit     *     *          (taps at bits E and 8)
+
+The current result of the XOR will be transferred into bit position 0 of the
+SR, upon the next shift cycle. The 1-bit random number output is taken from
+pin E, is inverted, then is sent to the volume/envelope decay hardware for
+the noise channel. The shift register is shifted upon receiving 2 clock
+pulses from the programmable timer (the shift frequency will be half that of
+the frequency from the programmable timer (one octave lower)).
+
+On system reset, this shift register is loaded with a value of 1.
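+
+The shift register can be simulated to check the two sequence lengths. The
+sketch below assumes the XOR taps sit at bits E and D for 32K mode and at
+bits E and 8 for 93-bit mode, with the register shifting towards bit E; the
+function names are made up for the example.

```python
def shift_once(register, short_mode):
    """Advance the 15-bit shift register by one shift cycle."""
    tap = 0x8 if short_mode else 0xD          # second XOR tap; the first is bit E
    feedback = ((register >> 0xE) ^ (register >> tap)) & 1
    return ((register << 1) | feedback) & 0x7FFF   # XOR result enters bit 0


def noise_output(register):
    """Bit E, inverted, as sent to the volume/envelope decay hardware."""
    return ((register >> 0xE) & 1) ^ 1


def sequence_length(short_mode, start=1):
    """Count shift cycles until the register returns to its reset value."""
    register, steps = shift_once(start, short_mode), 1
    while register != start:
        register, steps = shift_once(register, short_mode), steps + 1
    return steps


print(sequence_length(False), sequence_length(True))
```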
+
+
diff --git a/documentation/tech/exp/mmc5-e.txt b/documentation/tech/exp/mmc5-e.txt
new file mode 100644
index 00000000..eab191af
--- /dev/null
+++ b/documentation/tech/exp/mmc5-e.txt
@@ -0,0 +1,250 @@
+========= mmc5 information ==========
+date 1998/05/31
+by goroh
+translated May 31, 1998 by Sgt. Bowhack
+mail goroh_kun@geocities.co.jp
+
+5000,5004 ch1,ch2 Pulse Control
+ bit CCwevvvv
+ CC Duty Cycle (Positive vs. Negative)
+ #0:87.5% #1:75.0% #2:50.0% #3:25.0%
+ w Waveform Hold (e.g. Looping)
+ 0: Off 1: On
+ e Envelope Select
+ 0: Varied 1: Fixed
+ < e=0 >
+ vvvv Playback Rate
+ #0<-fast<--->-slow--> #15
+ < e=1 >
+ vvvv Output Volume
+
+5002,5006 ch1,ch2 frequency L
+ bit ffffffff
+5003,5007 ch1,ch2 frequency H
+ bit tttttfff
+ ttttt sound occurrence time
+
+Objective is to remove the continuous changing of frequency for
+square wave setup and do the same to the main part of the square wave
+of studying the main part of the famicom. (?- Sgt. Bowhack)
+
+5010 ch3 synthetic voice business channel
+ bit -------O
+ O wave output 0:Off 1:On
+
+5011 ch4 synthetic voice business channel 2
+ bit vvvvvvvv
+ vvvvvvvv wave size
+
+5015 sound output channel
+ bit ------BA
+ A: ch1 output 1:enable 0:disable
+ B: ch2 output 1:enable 0:disable
+
+5100 PRG-page size Setting
+ bit ------SS
+ SS PRG-page size
+ 0: 32k 1:16k 2,3:8k
+* Reset is misled the first times for about 8k (?- SB)
+
+5101 CHR-page size Setting
+ bit ------SS
+ SS CHR-page size
+ 0:8k 1:4k 2:2k 3:1k
+
+5102 W BBR-RAM Write Protect 1
+ bit ------AA
+5103 W BBR-RAM Write Protect 2
+ bit ------BB
+ Writing to BBR-RAM is permitted only when (AA,BB) = (2,1)
+*On reset, writing is prohibited
+
+5104 Grafix Mode Setting
+ $5c00-$5fff decides how it should be used
+ bit ------MM
+ #00:Enable only Split Mode
+ #01:Enable Split Mode & ExGrafix Mode
+ #02:ExRAM Mode
+ #03:ExRAM Mode & Write Protect
+
+Consideration
+ MMC5 has 2 graphic mode extensions that allow more than 256 characters
+on one standard game screen. It uses Split Mode so it can display the
+specified CHR-page and scroll position separate from ExGrafix Mode to
+be able to choose a palette, and the other divides it vertically.
+
+5105 W NameTable Setting
+ bit ddccbbaa
+ aa: Select VRAM at 0x2000-0x23ff
+ bb: Select VRAM at 0x2400-0x27ff
+ cc: Select VRAM at 0x2800-0x2bff
+ dd: Select VRAM at 0x2c00-0x2fff
+ #0:use VRAM 0x000-0x3ff
+ #1:use VRAM 0x400-0x7ff
+ #2:use ExVRAM 0x000-0x3ff
+ #3:use ExNameTable(Fill Mode)
+
+Consideration
+ The name table can designate 4 kinds of this register and be a useful
+special quality for this because painting and smashing it with a
+character that there is 1 sheet for the remaining sheets can generally
+be used. (?-SB)
+
+5106 W Fill Mode Setting 1
+ bit vvvvvvvv
+ Fill chr-table
+ For whether it paints or smashes it at any non-designated character
+
+5107 W Fill Mode Setting 2
+ bit ------pp
+ Whether or not it uses any non-designated palettes
+
+5113 RAM-page for $6000-$7FFF
+ bit -----p--
+
+5114-5117 Program Bank switch
+ < page_size=32k >
+ $5117 [8]-[F] bit pppppp--
+
+ < page_size=16k >
+ $5115 [8]-[B] bit ppppppp-
+ $5117 [C]-[F] bit ppppppp-
+
+ < page_size=8k >
+ $5114 [8][9] bit pppppppp
+ $5115 [A][B] bit pppppppp
+ $5116 [C][D] bit pppppppp
+ $5117* [E][F] bit pppppppp
+
+*Reset is around early, Last Page misled
+
+5120-512b Character Bank switch
+ < page_size=8k >
+ $5120-$5127 switch to mode A
+ $5128-$512b switch to mode B
+ $5127 [0]-[7] modeA
+ $512b [0]-[7] modeB
+
+ < page_size=4k >
+ $5120-$5127 switch to mode A
+ $5128-$512b switch to mode B
+ $5123 [0]-[3] modeA
+ $5127 [4]-[7] modeA
+ $512b [0]-[3],[4]-[7] modeB
+
+ < page_size=2k >
+ $5120-$5127 switch to mode A
+ $5128-$512b switch to mode B
+ $5121 [0]-[1] modeA
+ $5123 [2]-[3] modeA
+ $5125 [4]-[5] modeA
+ $5127 [6]-[7] modeA
+ $5129 [0]-[1],[4]-[5] modeB
+ $512b [2]-[3],[6]-[7] modeB
+
+ < page_size=1k >
+ $5120-$5127 switch to mode A
+ $5128-$512b switch to mode B
+ $5120 [0] modeA
+ $5121 [1] modeA
+ $5122 [2] modeA
+ $5123 [3] modeA
+ $5124 [4] modeA
+ $5125 [5] modeA
+ $5126 [6] modeA
+ $5127 [7] modeA
+ $5128 [0],[4] modeB
+ $5129 [1],[5] modeB
+ $512a [2],[6] modeB
+ $512b [3],[7] modeB
+
+Consideration
+ MMC5 has mode A, mode B and 2 kinds of CHR-page memory registers.
+They can be used for refreshing it. (?-SB)
+
+5130 ???
+analyzing it...
+
+5200 W Split Mode Control 1
+ bit Ec-vvvvv
+ For the E function 0:don't use 1:use
+ c boundary's side is for using Split Mode extension of graphics
+ 0: left side 1: right side
+ vvvvv left boundary is designated with the char. # to count places
+
+Sample.
+ 5200 <- #00
+ (not?) used yet
+ 5200 <- #82
+ Used for SplitMode GFX extension from left 1-2 character
+ 5200 <- #c2
+ Used for SplitMode GFX extension from the right side 3 chars.
+ 5200 <- #c0
+ Used for SplitMode GFX extension on the whole screen
+ 5200 <- #d0
+ Used for SplitMode GFX extension on the right side of the screen
+ 5200 <- #90
+ Used for SplitMode GFX extension on the left side of the screen
+
+5201 W SplitMode setup for SplitMode Ext. GFX use 1
+ $2005 determines the vertical movement; it can also delay ext. gfx's
+ vert. movement if necessary. It's written 2 times in bulk in the same
+ way as it would slip off a grade in $2005 (??-SB)
+
+5202 W SplitMode setup for SplitMode Ext. GFX use 2
+ bit --pppppp
+ uses vertical division of ext. gfx CHR-page designation
+ index_size=4k(0x1000byte)
+In case it uses a character 0x4000-0x4fff for the ext. gfx in question
+ $5202 <- 4
+
+5203 W scanline break point
+ For scanline # that it splits and wants to make it designate it in bulk
+
+5204 WR IRQ enable/disable
+ W bit I-------
+ I 1:IRQ Enable 0:IRQ Disable
+ R bit I-------
+ I 1:Scanline Hit 0:Scanline not Hit
+ $5203 is designated as scanline when arrived.
+
+5205 WR mult input/output
+5206 WR mult input/output
+($5205in)*($5206in) = $5205,$5206out
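+
+A minimal sketch of the multiplier follows. It assumes the 16-bit product
+is read back with the low byte in $5205 and the high byte in $5206, which
+this text does not state explicitly; the function name is hypothetical.

```python
def mmc5_multiply(value_5205, value_5206):
    """Return the (low, high) bytes read back from $5205 and $5206.

    Assumes an unsigned 8-bit x 8-bit multiply with a 16-bit result.
    """
    product = (value_5205 & 0xFF) * (value_5206 & 0xFF)
    return product & 0xFF, (product >> 8) & 0xFF


low, high = mmc5_multiply(0xFF, 0xFF)
print(f"$5205={low:02X} $5206={high:02X}")
```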
+
+5c00-5fbf ext. gfx business VRAM
+ shows an attribute of every position character
+
+ <ExGrafix Mode>
+ bit PPpppppp
+ PP: use character palette number
+ pppppp: use background CHR-PAGE number index=4k
+ #0-#3F are designations, $0000-$3FFF is CHR-data's range
+ Use for extension gfx
+
+ <Split Mode>
+ SplitMode uses a Name Table for extension gfx use.
+ bit pppppppp
+ pppppppp: use for background char. number designation
+
+ <ExRAM Mode>
+ Used for Extension RAM
+
+5fc0-5fff
+ <ExGrafix Mode>
+ (not?) used yet
+
+ <Split Mode>
+ SplitMode uses gfx's Attribute Table extension.
+ PPU uses $23c0-$23ff in the same way as the Attribute Table
+
+ <ExRAM Mode>
+ Used for Extension RAM
+
+Consideration
+ 5c00-5fff has 3 uses.
+ Split Mode and ExGrafix Mode's VBlank is written so as to become
+ crowded, it writes a 0 and becomes crowded.
+ Every mode tries to go around ExRAM mode including reading but it
+ writes it, is effective in bulk and #5c-#5f is the output at times
+ where it is effective. \ No newline at end of file
diff --git a/documentation/tech/exp/smb2j.txt b/documentation/tech/exp/smb2j.txt
new file mode 100644
index 00000000..074b1d3d
--- /dev/null
+++ b/documentation/tech/exp/smb2j.txt
@@ -0,0 +1,112 @@
+ SMB2j Revision "A". Mapper #50 Info
+ -----------------------------------
+
+
+12.09.2000
+V2.00
+
+Mapper info by The Mad Dumper
+---
+
+
+This mapper has been assigned the number 50. (that's 50 decimal)
+
+
+Wow, another SMB2j cart! This one is a lot different than the last one I
+worked on. It has a single 128K PRG ROM and VRAM. The other SMB2j had
+64K PRG and 8K CHR. Not much more to say other than this has one very
+fucked up mapper circuit on it!
+
+---
+
+
+The hardware:
+
+
+It consists of 6 TTL chips (74163, 74157, 74139, 7400, 7474, and a 4020),
+1 8K RAM chip for the VRAM, and 1 128K 28 pin ROM. There is some
+"M^2L" logic on the board (Mickey-Mouse Logic). It is a 3 input OR gate
+made out of 3 diodes and a resistor.
+
+Also, they swapped D3 and D6, as well as A1 and A3. Why this was done, I
+have no idea. It sure mussed up my REing efforts! I desoldered and read
+the ROM out through the EPROM programmer as a check and was not happy to
+find the data seemingly corrupt!
+
+After converting the ROM image over via some QBasic, it matched up great.
+You do not have to swap the addresses or data bytes; I have done this
+already in the released .NES ROM.
+
+---
+
+There are two registers on this cartridge. They are located in the 4000h-
+5FFFh block.
+
+Funny addresses are decoded for the register selection; presumably so they
+did not interfere with the FDS or NES registers.
+
+
+A15 ... A0
+-------------------
+010x xxxQ x01x xxxx
+
+
+x = Don't Care
+0 = must be 0
+1 = must be 1
+Q = register selection bit. 0 = ROM Page; 1 = IRQ Timer Enable
+
+-
+
+ROM Page Register:
+------------------
+
+Accessed when the address lines are in the above state. An example address
+would be 4020h. 4021h, 4022h, ... 403Fh, 40A0h, 40A1h, ... are all mirrors
+of this register. Writing here stores the desired bank #.
+
+7 bit 0
+---------
+xxxx DCBA
+
+These 4 bits are shown below in the ROM memory map. Note that they are
+somewhat "scrambled". The value of this register is unknown at powerup.
+
+-
+
+IRQ Timer.
+
+7 bit 0
+---------
+xxxx xxxI
+
+The IRQ Timer register controls the state of the IRQ timer. Writing here
+will turn it on or off. Turning the IRQ timer off resets it to 0. Writing
+a 1 here will turn the timer on, while writing a 0 will turn it off.
+
+The timer is composed of a binary ripple counter. After 4096 M2 cycles,
+/IRQ is pulled low. This is about 36 scanlines. The idea behind the timer
+is to split the screen for the score bar at the top. You start it at the
+beginning of the VBI routine, and then after 36 scanlines, it sends the IRQ
+which clears the timer, and resets the scroll registers. The value of this
+register is unknown at powerup.
+
+---
+
+ROM Memory Map:
+
+
+Address Range | Bank bits: 3210
+-------------------------------
+ 6000h-7FFFh 1111
+ 8000h-9FFFh 1000
+ A000h-BFFFh 1001
+ C000h-DFFFh DACB -- Selectable page
+ E000h-FFFFh 1011
+
+
+The ROM is composed of 16 8K banks. The 4 bank bits are shown above. Bit 3
+is the MSB while bit 0 is the LSB. 6000h-7FFFh is set to 1111b, or bank 0fh.
+All banks are FIXED except the bank at C000h-DFFFh. Only it can be changed.
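
The register decode and the "scrambled" bank bits described above can be sketched in a few lines of Python. This is only an illustration built from the address pattern and memory map in this document; the function and constant names are made up here and do not come from any emulator.

```python
# Sketch of the mapper #50 register decode and bank-bit scramble.
# All names are illustrative assumptions, not taken from real emulator code.

def decode_register(addr):
    """Return 'rom_page', 'irq_enable', or None for a CPU write address."""
    # A15..A13 must be 010, A6 must be 0, A5 must be 1 (010x xxxQ x01x xxxx).
    if (addr & 0xE060) != 0x4020:
        return None
    # A8 is the Q bit: 0 = ROM page register, 1 = IRQ timer enable.
    return "irq_enable" if addr & 0x0100 else "rom_page"


def selectable_bank(value):
    """Map register bits xxxxDCBA to the 8K bank number with bits DACB."""
    d = (value >> 3) & 1
    c = (value >> 2) & 1
    b = (value >> 1) & 1
    a = value & 1
    return (d << 3) | (a << 2) | (c << 1) | b


# Fixed 8K banks from the memory map; C000h-DFFFh uses selectable_bank().
FIXED_BANKS = {0x6000: 0x0F, 0x8000: 0x08, 0xA000: 0x09, 0xE000: 0x0B}
```

For example, writing 01h to 4020h would select bank 4 at C000h-DFFFh, since bit A lands in bank bit 2.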
+
+
diff --git a/documentation/tech/exp/tengen.txt b/documentation/tech/exp/tengen.txt
new file mode 100644
index 00000000..cc84d596
--- /dev/null
+++ b/documentation/tech/exp/tengen.txt
@@ -0,0 +1,18 @@
+Unknown:
+
+Alien Syndrome 128KiB/128KiB
+Super Sprint
+
+MIMIC 1:
+
+Fantasy Zone 64KiB/64KiB
+Toobin' 128KiB/64KiB
+Vindicators 64KiB/32KiB
+
+
+RAMBO 1 (board looks like it can take 256KiB PRG/256KiB CHR max):
+
+Klax 64KiB PRG/64KiB CHR
+Road Runner 64KiB PRG/128KiB CHR
+Rolling Thunder 128KiB PRG/128KiB CHR
+Skull and Crossbones 128KiB PRG/64KiB CHR
diff --git a/documentation/tech/exp/vrcvi.txt b/documentation/tech/exp/vrcvi.txt
new file mode 100644
index 00000000..52023577
--- /dev/null
+++ b/documentation/tech/exp/vrcvi.txt
@@ -0,0 +1,388 @@
+ VRCVI CHIP INFO
+ ----- ---- ----
+
+ By:
+
+
+ Kevin Horton
+ khorton@iquest.net
+
+
+
+
+ The RENES Project:
+ Reverse-engineering
+ the world.
+
+
+
+
+V1.01 08/31/99 teeny fix
+V1.00 08/31/99 Complete Version
+
+
+
+VRCVI (VRC6) (48 pin standard 600mil wide DIP)
+------------
+This chip is used in such games as Konami's CV3j and Madara. It's unique
+because it has some extra sound channels on it that get piped through the
+Famicom (note this is a fami-only chip and you will not find one in any
+NES game). "VI" of "VRCVI" is "6" for the roman numeral challenged.
+
+
+This chip generates its audio via a 6 bit R2R ladder. This is contained
+inside a 9 pin resistor network like so:
+
+
+ 3K 3K 3K 3K 3K 2K
+ /------*-\/\/-*-\/\/-*-\/\/-*-\/\/-*-\/\/-*------*-\/\/-\
+ | | | | | | | | |
+ \ \ \ \ \ \ \ | |
+ / / / / / / / | |
+ \ 6K \ 6K \ 6K \ 6K \ 6K \ 6K \ 6K | |
+ / / / / / / / | |
+ | | | | | | | | |
+ O O O O O O O O O
+
+ GND D0 D1 D2 D3 D4 D5 Aud In Aud Out
+
+
+Legend:
+-------
+
+(s) means this pin connects to the System
+(r) this only connects to the ROM
+(w) this is a SRAM/WRAM connection only
+AUD : these pass to the resistor network
+CHR : these connect to the CHR ROM and/or fami's CHR pins
+PRG : these connect to the PRG ROM and/or fami's PRG pins
+WRAM : this hooks to the WRAM
+CIRAM : the RAM chip which is on the fami board
+
+ .----\/----.
+ GND - |01 48| - +5V
+ AUD D1 - |02 47| - AUD D0
+ AUD D3 - |03 46| - AUD D2
+ AUD D5 - |04 45| - AUD D4
+ (s) PRG A12 - |05 44| - PRG A16 (r)
+ (s) PRG A14 - |06 43| - PRG A13 (s)
+ (s) M2 - |07 42| - PRG A17 (r)
+ (r) PRG A14 - |08 41| - PRG A15 (r)
+ *1 (s) PRG A1 - |09 40| - PRG A13 (r)
+ *1 (s) PRG A0 - |10 39| - PRG D7 (s)
+ (s) PRG D0 - |11 38| - PRG D6 (s)
+ (s) PRG D1 - |12 37| - PRG D5 (s)
+ (s) PRG D2 - |13 36| - PRG D4 (s)
+ (r) PRG /CE - |14 35| - PRG D3 (s)
+ (s) R/W - |15 34| - PRG /CE (s)
+ *2 (w) WRAM /CE - |16 33| - /IRQ (s)
+ (r) CHR /CE - |17 32| - CIRAM /CE (s)
+ (s) CHR /RD - |18 31| - CHR A10 (s)
+ (s) CHR /A13 - |19 30| - CHR A11 (s)
+ (r) CHR A16 - |20 29| - CHR A12 (s)
+ (r) CHR A15 - |21 28| - CHR A17 (r)
+ (r) CHR A12 - |22 27| - CHR A14 (r)
+ (r) CHR A11 - |23 26| - CHR A13 (r)
+ GND - |24 25| - CHR A10 (r)
+ | |
+ `----------'
+
+ VRCVI
+
+
+*1: On some VRCVI carts, these are reversed. This affects some registers.
+
+*2: This passes through a small pulse shaping network consisting of a
+ resistor, diode, and cap.
+
+
+Registers: (sound related only)
+----------
+
+regs 9000-9002 are for pulse channel #1
+regs a000-a002 are for pulse channel #2
+regs b000-b002 are for the phase accumulator channel (sawtooth)
+
+(bits listed 7 through 0)
+
+9000h: GDDDVVVV
+
+ D - Duty Cycle bits:
+
+ 000 - 1/16th ( 6.25%)
+ 001 - 2/16ths (12.50%)
+ 010 - 3/16ths (18.75%)
+ 011 - 4/16ths (25.00%)
+ 100 - 5/16ths (31.25%)
+ 101 - 6/16ths (37.50%)
+ 110 - 7/16ths (43.75%)
+ 111 - 8/16ths (50.00%)
+
+ V - Volume bits. 0000b is silence, 1111b is loudest. Volume is
+ linear. When in "normal" mode (see G bit), this acts as a general
+ volume control register. When in "digitized" mode, these act as a
+ 4 bit sample input.
+
+ G - Gate bit. 0=normal operation, 1=digitized. In digi operation,
+ registers 9001h and 9002h are totally disabled. Only bits 0-3 of
+ 9000h are used. Whatever binary word is present here is passed on
+ as a 4 bit digitized output.
+
+
+9001h: FFFFFFFF
+
+ F - Lower 8 bits of frequency data
+
+
+9002h: X---FFFF
+
+ X - Channel disable. 0=channel disabled, 1=channel enabled.
+
+ F - Upper 4 bits of frequency data
+
+
+A000h-A002h are identical in operation to 9000h-9002h. One note: this chip
+will mix both digitized outputs (if the G bits are both set) into one
+added output. (see in-depth chip operation below)
+
+B000h: --PPPPPP
+
+ P - Phase accumulator input bits
+
+B001h: FFFFFFFF
+
+ F - Lower 8 bits of frequency data
+
+B002h: X---FFFF
+
+ X - Channel disable. 0=channel disabled, 1=channel enabled.
+
+ F - Upper 4 bits of frequency data
+
+
+
+How the sounds are formed:
+--------------------------
+
+This chip is pretty cool. It outputs a 6 bit binary word for the sound
+which is passed through a DAC and finally to the NES/Fami. Because of this,
+the sound can be emulated *very* close to the original.
+
+
+I used my scope to figure all this out as well as my meter and logic probe
+so it should be 100% accurate.
+
+
+
+Block diagrams of the VRCVI: (as reverse engineered by me)
+----------------------------
+
+
+ | F bits | | D bits| | V bits |
+ | (12) | | (3) | | (4) |
+ \___________/ \_______/ \________/
++-----+ +----------------+ +-----------+ +------------+
+| | | | | | | |--\
+| OSC |-->|Divider (12 bit)|-->| Duty Cycle|-->| AND array |(4)> chan out
+|(M2) | | | | Generator | | |--/
++-----+ +----------------+ +-----------+ +------------+
+ ^ ^
+ | |
+ | |
+ G X
+
+ One Pulse channel (both are identical)
+ --------------------------------------
+
+
+How it works: The oscillator (in the NES, the clock arrives on the M2 line
+and is about 1.79MHz) generates our starting frequency. This is passed
+first into a divide by 1 to 4096 divider to generate a base frequency.
+This is then passed to the duty cycle generator. The duty cycle generator
+generates the desired duty cycle for the output waveform. The "D" input
+controls the duty cycle generator's duty cycle. If the "G" bit is
+a "1", it forces the output of the duty cycle generator to a "1" also. If
+the "X" bit is "0", it forces the output of the duty cycle generator to "0",
+which effectively disables the channel. Note that this input has precedence
+over the "G" bit.
+
+The AND array is just that- an array of 4 AND gates. If the output of
+the duty cycle generator is a "0", then the "chan out" outputs will all be
+forced to "0". If the output of the duty cycle generator is a "1", then
+the chan out outputs will follow the V bit inputs.
+
+Note that the output of this generator is a 4 bit binary word.
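
As a rough sketch, the gating logic above might look like this in Python. The position of the high portion inside the 16-step period is an assumption (this document only gives the duty ratio), and the function name is invented for illustration.

```python
# Sketch of one VRCVI pulse channel output, following the block diagram.
# ASSUMPTION: the "high" part of the duty cycle occupies the first steps
# of the 16-step period; only the ratio is stated in the text.

def pulse_output(step, duty, volume, gate=0, enable=1):
    """Return the 4 bit channel output for a given duty-generator step."""
    if not enable:
        # X = 0 forces the output low and has precedence over G.
        return 0
    if gate:
        # G = 1 forces the duty generator high: volume passes straight out.
        return volume & 0x0F
    # D = 0 gives 1/16 high, D = 7 gives 8/16 high.
    high = (step % 16) <= (duty & 0x07)
    # The AND array passes the V bits only while the generator is high.
    return (volume & 0x0F) if high else 0
```

Summing the output over 16 steps gives (D+1) times the volume, which matches the duty cycle table.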
+
+
+---
+
+
+ | F bits | | P bits|
+ | (12) | | (6) |
+ \___________/ \_______/
++-----+ +----------------+ +-----------+
+| | | | | |--\
+| OSC |-->|Divider (12 bit)|-->| Phase |(5)> chan out
+|(M2) | | | |Accumulator|--/
++-----+ +----------------+ +-----------+
+ ^
+ |
+ |
+ X
+
+ The Sawtooth (ramp) channel
+ ---------------------------
+
+
+This one is pretty similar to the above when it comes to frequency selection.
+The output frequency will be the same relative to the square wave channels.
+OK, the tough part will be explaining the phase accumulator. :-) What it is
+is just an adder tied to a latch. Every clock it adds a constant to the
+latch. In the case of the VRCVI, what you do is this:
+
+The ramp is generated over the course of 7 evenly spaced cycles, generated
+from the divider. Every clock from the divider causes the phase accumulator
+to add once. So... let's say we have 03h in the P bits. Every 7 cycles
+the phase accumulator (which is 8 bits) is reset to 00h.
+
+
+cycle:  accumulator:  chan out:  notes:
+-----------------------------------------
+0       00h           00h        On the first cycle, the acc. is reset to 0
+1       03h           00h        We add 3 to 0 to get 3
+2       06h           00h        We add 3 to 3 to get 6
+3       09h           01h
+4       0ch           01h
+5       0fh           01h
+6       12h           02h
+7       00h           00h        Reset the acc. back to 0 and do it again
+
+
+This will look like so: (as viewed on an oscilloscope)
+
+
+
+      -      -      -   2
+   ---    ---    ---    1
+---    ---    ---       0
+                      |
+012345601234560123456-+
+
+
+
+Note: if you enter a value that is too large (i.e. 30h) the accumulator
+*WILL WRAP*. Yes, this doesn't sound very good at all and you no longer
+have a sawtooth. ;-)
+
+
+The upper 5 bits of said accumulator are run to the "chan out" outputs.
+The lower 3 bits are not run anywhere.
+
+"X" disables the phase accumulator and forces all outputs to "0".
+Note that the output of this generator is a 5 bit word.
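
The accumulator behaviour can be checked with a small Python sketch. It follows the table above (8 bit accumulator, reset every 7 cycles, upper 5 bits output); the function name is made up for this example.

```python
# Sketch of one 7-step ramp of the VRCVI sawtooth phase accumulator.

def saw_cycle(p_bits, steps=7):
    """Return a list of (accumulator, chan_out) pairs for one ramp."""
    acc = 0                      # the accumulator is reset at cycle 0
    result = []
    for _ in range(steps):
        # Upper 5 bits of the 8 bit accumulator go to "chan out".
        result.append((acc, acc >> 3))
        # Add the 6 bit P value; the 8 bit accumulator wraps on overflow.
        acc = (acc + (p_bits & 0x3F)) & 0xFF
    return result
```

With P = 03h this reproduces the table; with P = 30h the last step wraps to 20h, which is the distortion mentioned above.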
+
+
+---
+
+Now that the actual sound generation is out of the way, here's how the
+channels are combined into the final 6 bit binary output:
+
+
++---------+
+| Pulse |
+|Generator|
+| #1 | Final 6 Bit
++---------+ Output
+ | (4) | / \
+ \ / | (6) |
++---------+ +---------+ +---------+
+| 4 Bit |--\ | 5 Bit | /--|Sawtooth |
+| Binary |(5)>| Binary |<(5)|Generator|
+| Adder |--/ | Adder | \--| |
++---------+ +---------+ +---------+
+ / \
+ | (4) |
++---------+
+| Pulse |
+|Generator|
+| #2 |
++---------+
+
+ Channel Combining
+ -----------------
+
+
+The three channels are finally added together through a series of adders
+to produce the final output word. The two pulse chans are most likely added
+first since they are 4 bit words, and that 5 bit result is most likely
+added to the sawtooth's output. (The actual adding order is not known,
+but I can make a *very* good guess. The above illustrated way uses the least
+amount of transistors). In the end it does not matter the order in which
+the words are added; the final word will always be the same.
+
+The final 6 bit output word is run through an "R2R" resistor ladder which
+turns the digital bits into a 64 level analog representation. The ladder
+is binary weighted and works like the DAC on your soundcard. :-)
+(so take heart emulator authors: just run the finished 6 bit word to
+your soundcard and it will sound right ;-).
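
A minimal Python sketch of the channel combining described above. The linear scaling in `dac_level` is an assumption for illustration (an ideal DAC); both function names are invented here.

```python
# Sketch of the VRCVI channel mixer: two 4 bit pulses plus the 5 bit saw.

def mix(pulse1, pulse2, saw):
    """Add the three channel words into the final 6 bit output word."""
    pulse_sum = (pulse1 & 0x0F) + (pulse2 & 0x0F)   # 5 bit intermediate
    return (pulse_sum + (saw & 0x1F)) & 0x3F        # 6 bit result


def dac_level(word):
    """ASSUMPTION: ideal linear DAC, 0..63 mapped onto 0.0..1.0."""
    return (word & 0x3F) / 63.0
```

The largest possible sum is 15 + 15 + 31 = 61, so the 6 bit word never overflows.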
+
+
+
+Frequency Generation:
+---------------------
+
+The chip generates all its output frequencies based on M2, which is
+colourburst divided by two (1789772.7272Hz). This signal is passed
+directly into the programmable dividers (the 12 bit frequency regs).
+
+Squares:
+--------
+
+These take the output of the programmable divider and then run it through
+the duty cycle generator, which in the process of generating the duty cycle,
+divides the frequency by 16.
+
+
+To calculate output frequency:
+
+         1789772.7272
+Fout = ----------------
+       (freq_in+1) * 16
+
+
+This will tell you the exact frequency in Hz. (note that the * 16 is to
+compensate for the divide by 16 duty cycle generator.)
+
+Saw:
+----
+
+This is similar to the above, however the duty cycle generator is replaced
+with a phase accumulator which divides the output frequency by 14.
+
+
+To calculate output frequency:
+
+         1789772.7272
+Fout = ----------------
+       (freq_in+1) * 14
+
+
+This will tell you the exact frequency in Hz. (note that the * 14 is to
+compensate for the phase accumulator.)
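
Both formulas are easy to put into code. The Python sketch below uses the M2 value quoted above; the helper names, and the inverse lookup `pulse_freq_in`, are additions for illustration only.

```python
# Sketch of the VRCVI frequency formulas from this document.

M2_HZ = 1789772.7272  # M2 clock as quoted above (NTSC)


def pulse_hz(freq_in):
    """Output frequency of a square channel for a 12 bit register value."""
    return M2_HZ / ((freq_in + 1) * 16)


def saw_hz(freq_in):
    """Output frequency of the sawtooth channel."""
    return M2_HZ / ((freq_in + 1) * 14)


def pulse_freq_in(target_hz):
    """Nearest 12 bit register value for a wanted square frequency."""
    value = round(M2_HZ / (16 * target_hz)) - 1
    return max(0, min(0xFFF, value))
```

For instance, `pulse_freq_in(440)` gives 253, and `pulse_hz(253)` is roughly 440.4 Hz.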
+
+
+
+So how accurate is this info, anyways?
+--------------------------------------
+
+I believe the info to be 100% accurate. I have extensively tested the
+output of the actual VRCVI chip to this spec and everything fits perfectly.
+I did this by using a register dump and a QBASIC program I wrote which
+takes the register dump and produces a WAV file. All frequency and
+duty cycle measurements were taken with a Fluke 83 multimeter, and all
+waveform data was culled from my oscilloscope measuring the real chip.
+
+
+
+
+---EOF---
diff --git a/documentation/tech/exp/vrcvii.txt b/documentation/tech/exp/vrcvii.txt
new file mode 100644
index 00000000..fc07d05e
--- /dev/null
+++ b/documentation/tech/exp/vrcvii.txt
@@ -0,0 +1,321 @@
+ VRCVII CHIP INFO
+ ------ ---- ----
+
+ By:
+
+
+ Kevin Horton
+ khorton@iquest.net
+
+
+
+
+ The RENES Project:
+ Reverse-engineering
+ the world.
+
+
+
+V0.10 11/05/99 Document started, pinned out chip and audio thingy
+V0.20 11/10/99 Added very, very, very preliminary register findings
+v1.00 11/14/99 First release version of this doc
+
+VRCVII (VRC7) (48 pin standard 600mil wide DIP)
+-------------
+
+This chip is used in only one Konami game that I know of- Lagrange Point.
+I heard rumours it was used in another game, so if someone could provide
+info and/or a ROM image, that would help immensely. It handles ROM
+bankswitching as well as sound generation. The sound generation is done
+using FM synthesis, so the music sounds like "Adlib" OPL2 music. Due to
+extra sound, this is a Famicom-only chip like its cousin the
+VRCVI. (See the VRCVI doc for more info)
+
+"VII" of "VRCVII" is "7" for the roman numeral challenged.
+
+
+This chip appears to generate all of its audio internally, which is then
+fed to a small hybrid (aka "black blob") that does the audio interfacing
+to the Famicom proper. It is physically a small ceramic substrate with
+an 8 pin SMD chip on it (probably an op-amp), and what appears to be three
+0805 sized SMD chip parts (capacitors most likely, since resistors can be
+formed on the substrate). The whole works is then coated with a black
+dipped epoxy coating, and the smooth side (opposite the parts) is then
+marked with an identifying part number and the pin 1 dot.
+
+
+Here's the pinout for it:
+
+ Front Side (parts facing away)
+
+ +-----------------+ 1- Audio in from Famicom
+ | 054002 | 2- Audio out to Famicom
+ |@ | 3,7 - Ground
+ +-----------------+ 4-6 - NC
+ | | | | | | | | | 8- Audio from VRCVII
+ 1 2 3 4 5 6 7 8 9 9- +5V
+
+
+
+
+
+Legend:
+-------
+
+(s) means this pin connects to the System
+(r) this only connects to the ROM
+(w) this is a SRAM/WRAM connection only
+PRG : these connect to the PRG ROM and/or fami's PRG pins
+WRAM : this hooks to the WRAM
+
+Note: There is a 3.58MHz ceramic resonator connected to the "X1" and "X2"
+pins. It is the three-pin style with internal caps tied to the third pin
+which is grounded.
+
+Chip is physically marked: "VRC VII 053982"
+
+ .----\/----.
+ *1 (RAM&s) CHR /OE - |01 48| - NC
+ *1 (RAM&s) CHR /CE - |02 47| - M2 (s)
+ GND - |03 46| - /CE WRAM (w)
+ (s) R/W - |04 45| - PRG /A15 (s) (aka /CE)
+ (s) /IRQ - |05 44| - PRG ROM /CE (r)
+ (s) CIRAM A11 - |06 43| - Audio Out
+ (s) PD0 - |07 42| - +5V
+ (s) PD1 - |08 41| - NC
+ (s) PD2 - |09 40| - NC
+ (s) PD3 - |10 39| - NC
+ (s) PD4 - |11 38| - NC
+ (s) PD5 - |12 37| - NC
+ (s) PD6 - |13 36| - CHR RAM A12
+ (s) PD7 - |14 35| - CHR RAM A11
+ +5V - |15 34| - CHR RAM A10
+ (s) PRG A5 - |16 33| - CHR A12 (s)
+ Crystal X2 - |17 32| - CHR A11 (s)
+ Crystal X1 - |18 31| - CHR A10 (s)
+ (s) PRG A4 - |19 30| - +5V
+ (r) PRG ROM A13 - |20 29| - PRG A14 (s)
+ (r) PRG ROM A14 - |21 28| - PRG A13 (s)
+ (r) PRG ROM A15 - |22 27| - PRG A12 (s)
+ (r) PRG ROM A16 - |23 26| - PRG ROM A18 (r)
+ GND - |24 25| - PRG ROM A17 (r)
+ | |
+ `----------'
+
+ VRCVII
+
+
+*1: these connect to both the CHR RAM's pins and the card edge.
+
+Note: the NC pins 37-41 are most likely for CHR ROM bankswitching. Since this
+cart uses CHR RAM these obviously weren't used ;-)
+
+Registers: (sound related only)
+----------
+
+All sound registers are accessed through only two physical registers.
+
+9010:
+-----
+
+This is the index register. You write the desired register number here.
+
+9030:
+-----
+
+This is the data register. Data written here is stored in the register
+pointed to by the above index register.
+
+There are 6 channels, each containing three registers, and 8 custom
+instrument control registers.
+
+
+Sound Registers:
+----------------
+
+00h - 07h : Custom instrument registers. See below for info.
+
+---
+
+10h - 15h : ffffffff
+
+f: Lower 8 bits of frequency
+
+---
+
+20h - 25h : ???tooof
+
+f: Upper bit of frequency
+o: Octave Select
+t: Channel trigger.
+?: Dunno what these do yet (No audible effect)
+
+---
+
+30h - 35h : iiiivvvv
+
+i: Instrument number
+v: Volume
+
+Instrument numbers 01h-0fh are fixed and cannot be changed.
+
+Instrument number 00h is the "programmable" one.
+
+To program the custom instrument, you load registers 00h-07h with the
+desired parameters for it. All channels set to instrument 00h will
+then use this instrument. Note that you can only program one custom
+instrument at a time.
+
+
+
+How do the frequency registers work?
+------------------------------------
+
+To generate a tone, you must select an octave and a frequency value. The
+frequency values stay the same for say, the note "C", while the octave
+bits determine which octave "C" lies in. This makes your note lookup table
+quite small.
+
+o = 000 is octave 0
+o = 001 is octave 1
+.
+.
+.
+o = 111 is octave 7
+
+
+    49722*freqval
+F = -------------
+    2^(19-octave)
+
+
+Where:
+
+F = output frequency in Hz
+freqval = frequency register value
+octave = desired octave (starting at 0)
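
The formula and the register layout above can be combined into a short Python sketch. The constant 49722 is the one given in this document; the function names are hypothetical, and the three "?" bits are simply written as 0.

```python
# Sketch of the VRCVII frequency formula and register packing.

def vrc7_hz(freqval, octave):
    """Output frequency in Hz for a 9 bit freqval and a 3 bit octave."""
    return 49722 * freqval / 2 ** (19 - octave)


def vrc7_regs(freqval, octave, trigger=1):
    """Return (value for 1xh, value for 2xh) for one channel."""
    low = freqval & 0xFF                      # 1xh: ffffffff
    high = (((trigger & 1) << 4)              # 2xh: ???tooof
            | ((octave & 7) << 1)
            | ((freqval >> 8) & 1))
    return low, high
```

Raising the octave by one doubles the frequency, which is why a single table of freqval entries covers every octave.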
+
+
+Custom Instrument Registers (00-07)
+-----------------------------------
+
+Note: I will not provide too extensive documentation of the instrument
+registers since their functions are identical to those of the OPL2 chip,
+commonly found on Adlib/Soundblaster/compatible cards, and there is a lot
+of information out on how to program these. I will use terminology
+similar to that found in said documents. My VRC7 "emulator" test program
+I wrote simply re-arranged and tweaked the register writes to correspond
+with the OPL2 registers.
+
+Here's a link to a good document about this chip:
+
+http://www.ccms.net/~aomit/oplx/
+
+The tremolo depth is set to 4.3dB and the vibrato depth is set to 14 cents
+(in regard to OPL2 settings; to achieve this you would write 0C0h to
+OPL register 0BDh). All operator connections are fixed in FM mode. (Where
+Modulator modulates the Carrier).
+
+---
+
+
+00 (Modulator) - tvskmmmm
+01 (Carrier)
+
+t: Tremolo Enable
+v: Vibrato Enable
+s: Sustain Enable
+k: KSR
+m: Multiplier
+
+---
+
+02 - kkoooooo
+
+k: Key Scale Level
+o: Output Level
+
+---
+
+03 - ---qweee
+
+-: Not used: Write 0's
+q: Carrier Waveform
+w: Modulator Waveform
+
+Note: There are only two waveforms available. Sine and rectified sine (only
+ the positive cycle of the sine; negative cycle "chopped off".)
+
+e: Feedback Control
+
+---
+
+04 (Modulator) - aaaadddd
+05 (Carrier)
+
+a: Attack
+d: Decay
+
+---
+
+06 (Modulator) - ssssrrrr
+07 (Carrier)
+
+s: Sustain
+r: Release
+
+
+
+Register Settings for the 15 fixed instruments.
+-----------------------------------------------
+
+*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*
+C C
+A These instruments are not 100% correct! There is no way to extract A
+U the register settings from the chip short of an electron microscope. U
+T I have "tuned" these instruments best I could, though I know a couple T
+I are not exactly right. Use them at your own peril! If someone wants I
+O to waste all day tuning a new set, please let me know what you get. O
+N N
+*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*CAUTION*
+
+
+ Register
+ --------
+
+ 00 01 02 03 04 05 06 07
+ -----------------------
+ 0 | -- -- -- -- -- -- -- --
+ 1 | 05 03 10 06 74 A1 13 F4
+ 2 | 05 01 16 00 F9 A2 15 F5
+ 3 | 01 41 11 00 A0 A0 83 95
+ 4 | 01 41 17 00 60 F0 83 95
+ 5 | 24 41 1F 00 50 B0 94 94
+ 6 | 05 01 0B 04 65 A0 54 95
+ 7 | 11 41 0E 04 70 C7 13 10
+ Instrument 8 | 02 44 16 06 E0 E0 31 35
+ ---------- 9 | 48 22 22 07 50 A1 A5 F4
+ A | 05 A1 18 00 A2 A2 F5 F5
+ B | 07 81 2B 05 A5 A5 03 03
+ C | 01 41 08 08 A0 A0 83 95
+ D | 21 61 12 00 93 92 74 75
+ E | 21 62 21 00 84 85 34 15
+ F | 21 62 0E 00 A1 A0 34 15
+
+
+
+
+So how accurate is this info, anyways?
+--------------------------------------
+
+I believe the info to be 100% accurate. The pinout was generated with the
+help of a multimeter and my Super-8 with both an NES cart and the Fami cart
+plugged in. (this allows me to measure between my known "control" board
+with the unknown "experimental" board to generate the connections).
+Register info was gleaned via writing test code and listening and capturing
+the resultant audio stream for analysis.
+
+
+
+
+---EOF---
diff --git a/documentation/tech/nsfspec.txt b/documentation/tech/nsfspec.txt
new file mode 100644
index 00000000..2ef526d2
--- /dev/null
+++ b/documentation/tech/nsfspec.txt
@@ -0,0 +1,336 @@
+ NES Music Format Spec
+ ---------------------
+
+
+By: Kevin Horton khorton@iquest.net
+
+
+NOTE:
+-----
+
+
+Remember that I am very willing to add stuff and update this spec. If
+you find a new sound chip or other change let me know and I will get back
+with you. E-mail to the above address.
+
+
+V1.61 - 06/27/2000 Updated spec a bit
+V1.60 - 06/01/2000 Updated Sunsoft, MMC5, and Namco chip information
+V1.50 - 05/28/2000 Updated FDS, added Sunsoft and Namco chips
+V1.32 - 11/27/1999 Added MMC5 register locations
+V1.30 - 11/14/1999 Added MMC5 audio bit, added some register info
+V1.20 - 09/12/1999 VRC and FDS prelim sound info added
+V1.00 - 05/11/1999 First official NSF specification file
+
+
+
+This file encompasses a way to transfer NES music data in a small, easy to
+use format.
+
+The basic idea is one rips the music/sound code from an NES game and prepends
+a small header to the data.
+
+A program of some form (6502/sound emulator) then takes the data and loads
+it into the proper place into the 6502's address space, then inits and plays
+the tune.
+
+Here's an overview of the header:
+
+offset  # of bytes  Function
+----------------------------
+
+0000    5   STRING  "NESM",01Ah  ; denotes an NES sound format file
+0005    1   BYTE    Version number (currently 01h)
+0006    1   BYTE    Total songs   (1=1 song, 2=2 songs, etc)
+0007    1   BYTE    Starting song (1=1st song, 2=2nd song, etc)
+0008    2   WORD    (lo/hi) load address of data (8000-FFFF)
+000a    2   WORD    (lo/hi) init address of data (8000-FFFF)
+000c    2   WORD    (lo/hi) play address of data (8000-FFFF)
+000e    32  STRING  The name of the song, null terminated
+002e    32  STRING  The artist, if known, null terminated
+004e    32  STRING  The Copyright holder, null terminated
+006e    2   WORD    (lo/hi) speed, in 1/1000000th sec ticks, NTSC (see text)
+0070    8   BYTE    Bankswitch Init Values (see text, and FDS section)
+0078    2   WORD    (lo/hi) speed, in 1/1000000th sec ticks, PAL (see text)
+007a    1   BYTE    PAL/NTSC bits:
+                      bit 0: if clear, this is an NTSC tune
+                      bit 0: if set, this is a PAL tune
+                      bit 1: if set, this is a dual PAL/NTSC tune
+                      bits 2-7: not used. they *must* be 0
+007b    1   BYTE    Extra Sound Chip Support
+                      bit 0: if set, this song uses VRCVI
+                      bit 1: if set, this song uses VRCVII
+                      bit 2: if set, this song uses FDS Sound
+                      bit 3: if set, this song uses MMC5 audio
+                      bit 4: if set, this song uses Namco 106
+                      bit 5: if set, this song uses Sunsoft FME-07
+                      bits 6,7: future expansion: they *must* be 0
+007c    4   ----    4 extra bytes for expansion (must be 00h)
+0080    nnn ----    The music program/data follows
+
+This may look somewhat familiar; if so that's because this is loosely
+based on the PSID file format for C64 music/sound.
+
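
The header layout above maps directly onto a fixed 128 byte structure. Here is a minimal Python sketch of a parser using the standard `struct` module; the dictionary keys and the function name are illustrative choices, not part of the spec.

```python
import struct

# Little-endian layout of the 80h byte NSF header described above.
NSF_HEADER = struct.Struct("<5sBBBHHH32s32s32sH8sHBB4s")


def parse_nsf_header(blob):
    """Parse the first 128 bytes of an NSF file into a dict (a sketch)."""
    if len(blob) < NSF_HEADER.size:
        raise ValueError("file too short for an NSF header")
    (magic, version, total, start, load, init, play, name, artist,
     holder, ntsc_speed, banks, pal_speed, region, chips,
     _expansion) = NSF_HEADER.unpack_from(blob)
    if magic != b"NESM\x1a":
        raise ValueError("not an NSF file")

    def text(raw):
        # Text fields are null terminated inside their 32 byte slot.
        return raw.split(b"\0", 1)[0].decode("ascii", "replace")

    return {
        "version": version, "total_songs": total, "start_song": start,
        "load": load, "init": init, "play": play,
        "name": text(name), "artist": text(artist), "copyright": text(holder),
        "ntsc_speed": ntsc_speed, "pal_speed": pal_speed,
        "banks": list(banks), "bankswitched": any(banks),
        "region_bits": region, "chip_bits": chips,
    }
```

A caller would typically read the first 80h bytes of the file, pass them to `parse_nsf_header`, and treat everything after offset 80h as the music data.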
+
+Loading a tune into RAM
+-----------------------
+
+If offsets 0070h to 0077h have 00h in them, then bankswitching is *not*
+used. If one or more bytes are something other than 00h then bankswitching
+is used. If bankswitching is used then the load address is still used,
+but you now use (ADDRESS AND 0FFFh) to determine where on the first bank
+to load the data.
+
+
+Each bank is 4K in size, and that means there are 8 of them for the
+entire 08000h-0ffffh range in the 6502's address space. You determine where
+in memory the data goes by setting bytes 070h thru 077h in the file.
+These determine the initial bank values that will be used, and hence where
+the data will be loaded into the address space.
+
+Here's an example:
+
+METROID.NSF will be used for the following explanation.
+
+The file is set up like so: (starting at 070h in the file)
+
+
+0070: 05 05 05 05 05 05 05 05 - 00 00 00 00 00 00 00 00
+0080: ... music data goes here...
+
+Since 0070h-0077h are something other than 00h, then we know that this
+tune uses bankswitching. The load address for the data is specified as
+08000h. We take this AND 0fffh and get 0000h, so we will load data in
+at byte 0 of bank 0, since data is loaded into the banks sequentially
+starting from bank 0 up until the music data is fully loaded.
+
+Metroid has 6 4K banks in it, numbered 0 through 5. The 6502's address
+space has 8 4K bankswitchable blocks on it, starting at 08000h-08fffh,
+09000h-09fffh, 0a000h-0afffh ... 0f000h-0ffffh. Each one of these is 4K in
+size, and the current bank is controlled by writes to 05ff8h thru 05fffh,
+one byte per bank. So, 05ff8h controls the 08000h-08fffh range, 05ff9h
+controls the 09000h-09fffh range, etc. up to 05fffh which controls the
+0f000h-0ffffh range. When the song is loaded into RAM, it is loaded into
+the banks and not the 6502's address space. Once this is done, then the
+bank control registers are written to set up the initial bank values.
+To do this, the value at 0070h in the file is written to 05ff8h, 0071h
+is written to 05ff9h, etc. all the way to 0077h is written to 05fffh.
+This should be done before every call to the init routine.
+
+If the tune was not bankswitched, then it is simply loaded in at the
+specified load address, until EOF
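
The bankswitched loading scheme can be sketched as follows in Python. Zero-filling the gap before the data and the tail of the last bank is an assumption (the spec does not say what the unused bytes hold), and the function names are invented for this example.

```python
# Sketch of loading a bankswitched tune into 4K banks and reading it back.

BANK_SIZE = 0x1000


def build_banks(data, load_addr):
    """Split music data into 4K banks, starting at (load_addr AND 0FFFh)."""
    # ASSUMPTION: padding before the data and after its end is zero.
    image = bytes(load_addr & 0x0FFF) + data
    image += bytes(-len(image) % BANK_SIZE)
    return [image[i:i + BANK_SIZE] for i in range(0, len(image), BANK_SIZE)]


def read_byte(banks, bank_regs, addr):
    """Read 8000h-FFFFh through the eight bank registers (5FF8h-5FFFh)."""
    slot = (addr - 0x8000) >> 12          # which 4K window of the CPU space
    return banks[bank_regs[slot]][addr & 0x0FFF]
```

With every bank register set to the same value, as in the Metroid header shown above, all eight 4K windows show the same bank until the tune writes new values.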
+
+
+Initializing a tune
+-------------------
+
+This is pretty simple. Load the desired song # into the accumulator,
+minus 1 and set the X register to specify PAL (X=1) or NTSC (X=0).
+If this is a single standard tune (i.e. PAL *or* NTSC but not both)
+then the X register contents should not matter. Once the song # and
+optional PAL/NTSC standard are loaded, simply call the INIT address.
+Once init is done, it should perform an RTS.
+
+
+Playing a tune
+--------------
+
+Once the tune has been initialized, it can now be played. To do this,
+simply call the play address several times a second. How many times
+per second is determined by offsets 006eh and 006fh in the file.
+These bytes denote the speed of playback in 1/1000000ths of a second.
+For the "usual" 60Hz playback rate, set this to 411ah.
+
+To generate a differing playback rate, use this formula:
+
+
+         1000000
+PBRATE= ---------
+          speed
+
+Where PBRATE is the value you stick into 006e/006fh in the file, and
+speed is the desired speed in hertz.
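
As a quick check of the formula, here is a small Python sketch. The function names are hypothetical, and truncating to an integer is an assumption chosen because it matches the 411ah value quoted above.

```python
# Sketch of the speed word calculation: microseconds between play calls.

def speed_ticks(rate_hz):
    """Value for 006eh/006fh (or 0078h/0079h) for a playback rate in Hz."""
    # ASSUMPTION: the result is truncated, which gives 411Ah for 60 Hz.
    return int(1_000_000 / rate_hz)


def calls_per_second(ticks):
    """Inverse: how often the play routine is called for a speed word."""
    return 1_000_000 / ticks
```

`speed_ticks(60)` is 16666, which is 411ah.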
+
+
+"Proper" way to load the tune
+-----------------------------
+
+1) If the tune is bankswitched, go to #3.
+
+2) Load the data into the 6502's address space starting at the specified
+ load address. Go to #4.
+
+3) Load the data into a RAM area, starting at (start_address AND 0fffh).
+
+4) Tune load is done.
+
+
+"Proper" way to init a tune
+---------------------------
+
+1) Clear all RAM at 0000h-07ffh.
+
+2) Clear all RAM at 6000h-7fffh.
+
+3) Init the sound registers by writing 00h to 04000-0400Fh, 10h to 4010h,
+ and 00h to 4011h-4013h.
+
+4) Set volume register 04015h to 00fh.
+
+5) If this is a banked tune, load the bank values from the header into
+ 5ff8-5fffh.
+
+6) Set the accumulator and X registers for the desired song.
+
+7) Call the music init routine.
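
Steps 3 through 5 above amount to a fixed list of register writes. A player might generate them like this; the Python below is only a sketch, and the function name and the (address, value) tuple format are assumptions made for the example.

```python
# Sketch of the register writes performed before calling the init routine.

def init_register_writes(bank_init):
    """Return (address, value) pairs for steps 3-5 of the init sequence."""
    writes = [(addr, 0x00) for addr in range(0x4000, 0x4010)]   # 4000-400F
    writes.append((0x4010, 0x10))
    writes += [(addr, 0x00) for addr in range(0x4011, 0x4014)]  # 4011-4013
    writes.append((0x4015, 0x0F))
    if any(bank_init):
        # Banked tune: header bytes 70h-77h go to 5FF8h-5FFFh.
        writes += [(0x5FF8 + i, v) for i, v in enumerate(bank_init)]
    return writes
```

Clearing RAM (steps 1 and 2) and setting A and X (step 6) are left to the surrounding 6502 emulator.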
+
+
+"Proper" way to play a tune
+---------------------------
+
+1) Call the play address of the music at periodic intervals determined
+ by the speed words. Which word to use is determined by which mode
+ you are in- PAL or NTSC.
+
+
+Sound Chip Support
+------------------
+
+Byte 007bh of the file stores the sound chip flags. If a particular flag
+is set, those sound registers should be enabled. If the flag is clear,
+then those registers should be disabled.
+
+* VRCVI Uses registers 9000-9002, A000-A002, and B000-B002, write only.
+
+Caveats: 1) The above registers are *write only* and must not disrupt music
+ code that happens to be stored there.
+
+ 2) Major caveat: The A0 and A1 lines are flipped on a few games!!
+ If you rip the music and it sounds all funny, flip around
+ the xxx1 and xxx2 register pairs. (i.e. 9001 and 9002) 9000
+ and 9003 can be left untouched. I decided to do this since it
+ would make things easier all around, and this means you only
+ will have to change the music code in a very few places (6).
+ Esper2 and Madara will need this change, while Castlevania 3j
+ will not for instance.
+
+ 3) See my VRCVI.TXT doc for a complete register description.
+
+* VRCVII Uses registers 9010 and 9030, write only.
+
+Caveats: 1) Same caveat as #1, above.
+
+ 2) See my VRCVII.TXT doc for a complete register description.
+
+* FDS Sound uses registers from 4040 through 4092.
+
+Caveats: 1) 6000-DFFF is assumed to be RAM, since 6000-DFFF is RAM on the
+ FDS. E000-FFFF is usually not included in FDS games because
+ it is the BIOS ROM. However, it can be used on FDS rips to help
+ the ripper (for modified play/init addresses).
+
+ 2) Bankswitching operates slightly different on FDS tunes.
+ 5FF6 and 5FF7 control the banks 6000-6FFF and 7000-7FFF
+ respectively. NSF header offsets 76h and 77h correspond to
+ *both* 6000-7FFF *AND* E000-FFFF. Keep this in mind!
+
+* MMC5 Sound Uses registers 5000-5015, write only as well as 5205 and 5206,
+ and 5C00-5FF5
+
+Caveats: 1) Generating a proper doc file. Be patient.
+
+ 2) 5205 and 5206 are a hardware 8*8 multiplier. The idea being
+ you write your two bytes to be multiplied into 5205 and 5206
+ and after doing so, you read the result back out. Still working
+ on what exactly triggers it (I think a write to either 5205
+ or 5206 triggers the multiply).
+
+ 3) 5C00-5FF5 should be RAM to emulate EXRAM while in MMC5 mode.
+
+Note: Thanks to Mamiya for the EXRAM info.
+
+
+* Namco 106 Sound Uses registers 4800 and F800.
+
+ This works similar to VRC7. 4800 is the "data" port which is
+ readable and writable, while F800 is the "address" port and is
+ writable only.
+
+ The address is 7 bits plus a "mode" bit. Bit 7 controls
+ address auto-incrementing. If bit 7 is set, the address will
+ auto-increment after a byte of data is read or written from/to
+ 4800.
+
+ $40 ffffffff f:frequency L
+ $42 ffffffff f:frequency M
+ $44 ---sssff f:frequency H s:tone length (8-s)*4 in 4bit-samples
+ $46 tttttttt t:tone address(4bit-address,$41 means high-4bits of $20)
+ $47 -cccvvvv v:linear volume 1+c:number of channels in use($7F only)
+ $40-47:ch1 $48-4F:ch2 ... $78-7F:ch8
+ ch2-ch8 same to ch1
+
+ $00-3F(8ch)...77(1ch) hhhhllll tone data
+ h:odd address data(signed 4bit)
+ l:even address data(signed 4bit)
+
+ real frequency = (f * NES_BASECYCLES) / (40000h * (c+1) * (8-s)*4 * 45)
+ NES_BASECYCLES 21477270(Hz)
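+
+ As a worked example, the formula above can be written out directly. The function names are illustrative, and assembling f from the three frequency registers with the two "ff" bits of $44 as its top bits is an assumption based on the register table.
+
```python
NES_BASECYCLES = 21477270  # Hz, as given above


def n106_frequency_value(low, mid, high):
    """Combine the L, M and H register bytes into the 18-bit value f.

    Assumption: the two 'ff' bits of the H register are bits 16-17 of f.
    """
    return (low & 0xFF) | ((mid & 0xFF) << 8) | ((high & 0x03) << 16)


def n106_real_frequency(f, c, s):
    """Real frequency in Hz for frequency value f and the 'ccc' and 'sss' fields."""
    tone_length = (8 - s) * 4     # in 4-bit samples
    return (f * NES_BASECYCLES) / (0x40000 * (c + 1) * tone_length * 45)
```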
+
+Note: Very Special thanks to Mamiya for this information!
+
+
+* Sunsoft FME-07 Sound uses registers C000 and E000
+
+ This is similar to the common AY 3-8910 sound chip that is
+ used on tons of arcade machines, and in the Intellivision.
+
+ C000 is the address port
+ E000 is the data port
+
+ Both are write-only, and behave like the AY 3-8910.
+
+Note: Special thanks to Mamiya for this information as well
+
+
+Caveats
+-------
+
+1) The starting song number and maximum song numbers start counting at
+ 1, while the init address of the tune starts counting at 0. To
+ "fix", simply pass the desired song number minus 1 to the init
+ routine.
+
+2) The NTSC speed word is used *only* for NTSC tunes, or dual PAL/NTSC tunes.
+ The PAL speed word is used *only* for PAL tunes, or dual PAL/NTSC tunes.
+
+3) The length of the text in the name, artist, and copyright fields must
+ be 31 characters or less! There has to be at least a single NULL byte
+ (00h) after the text, between fields.
+
+4) If a field is not known (name, artist, copyright) then the field must
+ contain the string "<?>" (without quotes).
+
+5) There should be 8K of RAM present at 6000-7FFFh. MMC5 tunes need RAM at
+ 5C00-5FF7 to emulate its EXRAM. 8000-FFFF Should be read-only (not
+ writable) after a tune has loaded. The only time this area should be
+ writable is if an FDS tune is being played.
+
+6) Do not assume the state of *anything* on entry to the init routine
+ except A and X. Y can be anything, as can the flags.
+
+7) Do not assume the state of *anything* on entry to the play routine either.
+ Flags, X, A, and Y could be at any state. I've fixed about 10 tunes
+ because of this problem and the problem, above.
+
+8) The stack sits at 1FFh and grows down. Make sure the tune does not
+ attempt to use 1F0h-1FFh for variables. (Armed Dragon Villigust did and
+ I had to relocate its RAM usage to 2xx)
+
+9) Variables should sit in the 0000h-07FFh area *only*. If the tune writes
+ outside this range, say 1400h this is bad and should be relocated.
+ (Terminator 3 did this and I relocated it to 04xx).
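+
+Caveats 1, 3 and 4 are easy to get wrong in a tool that writes NSF headers, so here is a minimal sketch of them. The function names are made up, and the 32-byte field width is an assumption taken from the header layout (31 characters plus the terminating 00h).
+
```python
FIELD_SIZE = 32  # assumed width of the name, artist and copyright fields


def encode_nsf_text(text=None):
    """Return a NUL-padded header field; unknown fields become '<?>'."""
    if not text:
        text = "<?>"
    raw = text.encode("ascii")
    if len(raw) > FIELD_SIZE - 1:
        raise ValueError("NSF text fields hold at most 31 characters")
    return raw.ljust(FIELD_SIZE, b"\x00")


def init_song_argument(song_number):
    """Convert a 1-based header song number to the 0-based init value."""
    if song_number < 1:
        raise ValueError("song numbers in the header start at 1")
    return song_number - 1
```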
+
+That's it!
+
+
+
diff --git a/documentation/tech/ppu/2c02 technical operation.txt b/documentation/tech/ppu/2c02 technical operation.txt
new file mode 100644
index 00000000..3a79008b
--- /dev/null
+++ b/documentation/tech/ppu/2c02 technical operation.txt
@@ -0,0 +1,296 @@
+*******************************
+*NTSC 2C02 technical operation*
+*******************************
+Brad Taylor (big_time_software@hotmail.com)
+
+1st release: Sept 25th, Y2K
+2nd release: Jan 27th, 2K3
+3rd release: Feb 4th, 2K3
+4th release: Feb 19th, 2K3
+
+
+ This document describes the low-level operation and technical details of the 2C02, the NES's PPU. In general, it contains important information in regards to PPU timing, which no NES coder/emulator author should be without. This document assumes that you already understand the basics of how the PPU works, like how the playfield/object images are generated, and the behaviour of scroll/address counters during playfield rendering.
+
+ A lot of the concepts behind how the PPU works described here have been extracted from Nintendo's patent documentation (U.S.#4,824,106). With block diagrams of the PPU's architecture (and even some schematics), these papers will definitely aid in the comprehension of this complex device.
+
+ Since the first release, this document has been given a major overhaul. Most sections of the document have been reworked, and new information has been added just about everywhere. If you've read the old version of this document before, I recommend that you read this new one in its entirety; there's new information even in sections which may look like they haven't changed much.
+
+ Topics discussed hereon are as follows.
+
+ - Video signal generation
+ - PPU base timing
+ - Miscellaneous PPU info
+ - PPU memory access cycles
+ - Frame rendering details
+ - Scanline rendering details
+ - In-range object evaluation
+ - Details of playfield render pipeline
+ - Details of object pattern fetch & render
+ - Extra cycle frames
+ - The MMC3's scanline counter
+ - PPU pixel priority quirk
+ - Graphical enhancements
+
+
++-------+
+|History|
++-------+
+ On the weekend of Sept. 25th, Y2K, I set up an experiment with my NTSC NES MB & my PC so's I could RE the PPU's timing. What I did was (using a PC interface) analyse the changes that occur on the PPU's address and data pins on every rising & falling edge of the PPU's clock. I was not planning on removing the PPU from the motherboard (yet), so basically I just kept everything intact (minus the stuff I added onto the MB so I could monitor the PPU's signals), and popped in a game, so that it would initialize the PPU for me (I used DK classics, since it was only taking something like 4 frames before it was turning on the background/sprites).
+
+ The only change I made was taking out the 21 MHz clock generator circuitry. To replace the clock signal, I connected a port controlled latch to the NES's main clock line instead. Now, by writing a 0 or a 1 out to a PC ISA port of my choice (I was using $104), I was able to control the 21 MHz clockline of the NES. After I would create a rise or a fall on the NES's clock line, I would then read in the data that appeared on the PPU's address and data pins, which included monitoring what PPU registers the game read/wrote to (& the data that was read/written).
+
+
++-----------------------+
+|Video signal generation|
++-----------------------+
+ A 21.48 MHz clock signal is fed into the PPU. This is the NES's main clock line, which is shared by the CPU.
+
+ Inside the PPU, the 21.48 MHz signal is used to clock a three-stage Johnson counter. The complementary outputs of both master and slave portions of each stage are used to form 12 mutually exclusive output phases- all 3.58 MHz each (the NTSC colorburst). These 12 different phases form the basis of all color generation for the PPU's composite video output.
+
+ Naturally, when the user programs the lower 4-bits of a palette register, they are essentially selecting any 1 of 12 phases to be routed to the PPU's video out pin (this corresponds to chrominance (tint/hue) video information) when the appropriate pixel indexes it. Other chrominance combinations (0 & 13) are simply hardwired to a 1 or 0 to generate grayscale pixels.
+
+ Bits 4 & 5 of a palette entry select 1 of 4 linear DC voltage offsets to apply to the selected chrominance signal (this corresponds to luminance (brightness) video information) for a pixel.
+
+ Chrominance values 14 & 15 yield a black pixel color, regardless of any luminance value setting.
+
+ Luminance value 0, mixed with chrominance value 13, yields a "blacker than black" pixel color. This super black pixel has an output voltage level close to the vertical/horizontal synchronization pulses. Because of this, some video monitors will display warped/distorted screens for games which use this color for black (Game Genie is the best example of this). Essentially what is happening is the video monitor's horizontal timing is compromised by what it thinks are extra synchronization pulses in the scanline. This is not damaging to the monitors which are affected by it, but use of the super black color should be avoided, due to the graphical distortion it causes.
+
+ The amplitude of the selected chrominance signal (via the 4 lower bits of a palette register) remains constant regardless of bits 4 or 5. Thus it is not possible to adjust the saturation level of a particular color.
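+
+ The palette entry layout described in this section can be summed up in a few lines. This is just a sketch; the function name and the wording of the returned labels are made up for illustration.
+
```python
def describe_palette_entry(entry):
    """Split a 6-bit palette entry into chrominance, luminance and a label."""
    chroma = entry & 0x0F          # lower 4 bits: 1 of 12 phases, or a special case
    luma = (entry >> 4) & 0x03     # bits 4 and 5: 1 of 4 DC offsets
    if chroma >= 14:
        kind = "black"                      # regardless of luminance
    elif chroma == 13 and luma == 0:
        kind = "blacker than black"         # close to sync level, best avoided
    elif chroma in (0, 13):
        kind = "gray"                       # hardwired to 1 or 0
    else:
        kind = "color phase %d of 12" % chroma
    return chroma, luma, kind
```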
+
+
++---------------+
+|PPU base timing|
++---------------+
+ Other than the 3-stage Johnson counter, the 21.48 MHz signal is not used directly by any other PPU hardware. Instead, the signal is divided by 4 to get 5.37 MHz, and is used as the smallest unit of timing in the PPU. All following references to PPU clock cycle (abbr. "cc") timing in this document will be in respect to this timing base, unless otherwise indicated.
+
+ - Pixels are rendered at the same rate as the base PPU clock. In other words, 1 clock cycle= 1 pixel.
+
+ - 341 PPU cc's make up the time of a typical scanline (or 341/3 CPU cc's).
+
+ - One frame consists of 262 scanlines. This equals 341*262 PPU cc's per frame (divide by 3 for # of CPU cc's).
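+
+ For reference, the numbers above work out as follows. The exact master clock value of 21477270 Hz is an assumption (the text only says 21.48 MHz), and the frame rate shown ignores the shortened odd frames covered later on.
+
```python
from fractions import Fraction

MASTER_HZ = Fraction(21477270)     # assumed exact value of the "21.48 MHz" clock
PPU_HZ = MASTER_HZ / 4             # about 5.37 MHz, one pixel per cycle
CPU_HZ = PPU_HZ / 3                # 3 PPU cc's per CPU cc

CC_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262
CC_PER_FRAME = CC_PER_SCANLINE * SCANLINES_PER_FRAME

CPU_CC_PER_SCANLINE = Fraction(CC_PER_SCANLINE, 3)
CPU_CC_PER_FRAME = Fraction(CC_PER_FRAME, 3)
FRAMES_PER_SECOND = PPU_HZ / CC_PER_FRAME

if __name__ == "__main__":
    print("PPU clock        : %.1f Hz" % float(PPU_HZ))
    print("PPU cc per frame : %d" % CC_PER_FRAME)
    print("CPU cc per line  : %.2f" % float(CPU_CC_PER_SCANLINE))
    print("CPU cc per frame : %.2f" % float(CPU_CC_PER_FRAME))
    print("frames per second: %.4f" % float(FRAMES_PER_SECOND))
```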
+
+
++------------------------+
+|PPU memory access cycles|
++------------------------+
+ All PPU memory access cycles are 2 clocks long, and can be made back-to-back (typically done during rendering). Here's how the access breaks down:
+
+ At the beginning of the access cycle, PPU address lines 8..13 are updated with the target address. This data remains here until the next time an access cycle occurs.
+
+ The lower 8-bits of the PPU address lines are multiplexed with the data bus, to reduce the PPU's pin count. On the first clock cycle of the access, A0..A7 are put on the PPU's data bus, and the ALE (address latch enable) line is activated for the first half of the cycle. This loads the lower 8-bit address into an external 8-bit transparent latch strobed by ALE (74LS373 is used).
+
+ On the second clock cycle, the /RD (or /WR) line is activated, and stays active for the entire cycle. Appropriate data is driven onto the bus during this time.
+
+
++----------------------+
+|Miscellaneous PPU info|
++----------------------+
+ - Sprite DMA is 1536 clock cycles long (512 CPU cc's). 256 individual transfers are made from CPU memory to a temp register inside the CPU, then from the CPU's temp reg, to $2004.
+
+ - The PPU makes NO external access to the PPU bus, unless the playfield or objects are enabled during a scanline outside vblank. This means that the PPU's address and data busses are dead while in this state.
+
+ - palette RAM is accessed internally during playfield rendering (i.e., the palette address/data is never put on the PPU bus during this time). Additionally, when the programmer accesses palette RAM via $2006/7, the palette address accessed actually does show up on the PPU address bus, but the PPU's /RD & /WR flags are not activated. This is required to prevent writing over name table data falling under the appropriate mirrored area (since the name table RAM's address decoder simply consists of an inverter connected to the A13 line- effectively decoding all addresses in $2000-$3FFF).
+
+ - the VINT impulse (NMI) and bit $2002.7 are set simultaneously. Reading $2002 will reset bit 7, but it seems that the VINT flag goes down on its own. Because of this, when the PPU generates a VINT, it doesn't require any acknowledgement whatsoever; it will continue firing off VINTs, regardless of inservice to $2002. The only way to stop VINTs is to clear $2000.7.
+
+ - Because the PPU cannot make a read from PPU memory immediately upon request (via $2007), there is an internal buffer, which acts as a 1-stage data pipeline. As a read is requested, the contents of the read buffer are returned to the NES's CPU. After this, at the PPU's earliest convenience (according to PPU read cycle timings), the PPU will fetch the requested data from the PPU memory, and throw it in the read buffer. Writes to PPU mem via $2007 are pipelined as well, but it is unknown to me if the PPU uses this same buffer (this could be easily tested by writing something to $2007, and seeing if the same value is returned immediately after reading).
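+
+ The 1-stage read pipeline from the last point looks like this in emulator terms. It is a sketch only: the class name is made up, the address is passed in directly instead of coming from $2006, the buffer's initial content is an assumption, and nothing here models the palette RAM special case mentioned above.
+
```python
class BufferedPPURead:
    """Model of the $2007 read buffer: each read returns the previous fetch."""

    def __init__(self, memory):
        self.memory = memory    # any indexable object standing in for PPU memory
        self.buffer = 0         # assumed initial content of the read buffer

    def read_2007(self, address):
        stale = self.buffer                   # what the CPU sees right now
        self.buffer = self.memory[address]    # fetched afterwards, for the next read
        return stale
```
+
+ In practice this means the first value read after setting a new address is left over from the previous access, and has to be thrown away.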
+
+
++-----------------------+
+|Frame rendering details|
++-----------------------+
+ The following describes the PPU's status during all 262 scanlines of a frame. Any scanlines where work is done (like image rendering), consists of the steps which will be described in the next section.
+
+ 0..19: Starting at the instant the VINT flag is pulled down (when an NMI is generated), 20 scanlines make up the period of time on the PPU which I like to call the VINT period. During this time, the PPU makes no access to its external memory (i.e. name / pattern tables, etc.).
+
+ 20: After 20 scanlines worth of time go by (since the VINT flag was set), the PPU starts to render scanlines. This first scanline is a dummy one; although it will access its external memory in the same sequence it would for drawing a valid scanline, no on-screen pixels are rendered during this time, making the fetched background data immaterial. Both horizontal *and* vertical scroll counters are updated (presumably) at cc offset 256 in this scanline. Other than that, the operation of this scanline is identical to any other. The primary reason this scanline exists is to start the object render pipeline, since it takes 256 cc's worth of time to determine which objects are in range or not for any particular scanline.
+
+ 21..260: after rendering 1 dummy scanline, the PPU starts to render the actual data to be displayed on the screen. This is done for 240 scanlines, of course.
+
+ 261: after the very last rendered scanline finishes, the PPU does nothing for 1 scanline (i.e. the programmer gets screwed out of perfectly good VINT time). When this scanline finishes, the VINT flag is set, and the process of drawing lines starts all over again.
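+
+ The scanline numbering used above (counted from the start of the VINT period) can be captured in a small lookup. The function name and the role labels are made up for illustration.
+
```python
def scanline_role(line):
    """Classify scanline 0..261, counted from the start of the VINT period."""
    if not 0 <= line <= 261:
        raise ValueError("a frame has scanlines 0..261")
    if line <= 19:
        return "vint"      # no external memory access
    if line == 20:
        return "dummy"     # fetches like a real line, draws nothing
    if line <= 260:
        return "render"    # the 240 visible scanlines
    return "idle"          # PPU does nothing for one line


def screen_line(line):
    """Map a rendered scanline (21..260) to its on-screen row (0..239)."""
    if scanline_role(line) != "render":
        raise ValueError("scanline %d is not visible" % line)
    return line - 21
```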
+
+
++--------------------------+
+|Scanline rendering details|
++--------------------------+
+ Naturally, the PPU will fetch data from name, attribute, and pattern tables during a scanline to produce an image on the screen. This section details the PPU's doings during this time.
+
+ As explained before, external PPU memory can be accessed every 2 cc's. With 341 cc's per scanline, this gives the PPU enough time to make 170 memory accesses per scanline (and it uses all of them!). After the 170th fetch, the PPU does nothing for 1 clock cycle. Remember that a single pixel is rendered every clock cycle.
+
+
+ Memory fetch phase 1 thru 128
+ -----------------------------
+ 1. Name table byte
+ 2. Attribute table byte
+ 3. Pattern table bitmap #0
+ 4. Pattern table bitmap #1
+
+ This process is repeated 32 times (32 tiles in a scanline).
+
+
+ This is when the PPU retrieves the appropriate data from PPU memory for rendering the playfield. The first playfield tile fetched here is actually the 3rd to be drawn on the screen (the playfield data for the first 2 tiles to be rendered on this scanline are fetched at the end of the scanline prior to this one).
+
+ All valid on-screen pixel data arrives at the PPU's video out pin during this time (256 clocks). For determining the precise delay between when a tile's bitmap fetch phase starts (the whole 4 memory fetches), and when the first pixel of that tile's bitmap data hits the video out pin, the formula is (16-n) clock cycles, where n is the fine horizontal scroll offset (0..7 pixels). This information is relevant for understanding the exact timing operation of the "object 0 collision" flag.
+
+ Note that the PPU fetches an attribute table byte for every 8 sequential horizontal pixels it draws. This essentially limits the PPU's color area (the area of pixels which are forced to use the same 3-color palette) to only 8 horizontally sequential pixels.
+
+ It is also during this time that the PPU evaluates the "Y coordinate" entries of all 64 objects in object attribute RAM (OAM), to see if the objects are within range (to be drawn on the screen) for the *next* scanline (this is why Y-coordinate entries in the OAM must be programmed to a value 1 less than the scanline the object is to appear on). Each evaluation (presumably) takes 4 clock cycles, for a total of 256 (which is why it's done during on-screen pixel rendering).
+
+
+ In-range object evaluation
+ --------------------------
+ An 8-bit comparator is used to calculate the 9-bit difference between the current scanline (minus 21), and each Y-coordinate (plus 1) of every object entry in the OAM. Objects are considered in range if the comparator produces a difference in the range of 0..7 (if $2000.5 currently = 0), or 0..15 (if $2000.5 currently = 1).
+
+ (Note that a 9-bit comparison result is generated. This means that setting object scanline coordinates for ranges -1..-15 are actually interpreted as ranges 241..255. For this reason, objects with these ranges will never be considered to be part of any on-screen scanline range, and will not allow smooth object scrolling off the top of the screen.)
+
+ Tile index (8 bits), X-coordinate (8 bits), & attribute information (4 bits; vertical inversion is excluded) from the in-range OAM element, plus the associated 4-bit result of the range comparison accumulate in a part of the PPU called the "sprite temporary memory". Logical inversion is applied to the loaded 4-bit range comparison result, if the object's vertical inversion attribute bit is set.
+
+ Since object range evaluations occur sequentially through the OAM (starting from entry 0 to 63), the sprite temporary memory always fills in order from the highest priority in-range object, to lower ones. A 4-bit "in-range" counter is used to determine the number of found objects on the scanline (from 0 up to 8), and serves as an index pointer for placement of found object data into the 8-element sprite temporary memory. The counter is reset at the beginning of the object evaluation phase, and is post-incremented every time an object is found in-range. This occurs until the counter equals 8, when found object data after this is discarded, and a flag (bit 5 of $2002) is raised, indicating that it is going to be dropping objects for the next scanline.
+
+ An additional memory bit associated with the sprite temporary memory is used to indicate that the primary object (#0) was found to be in range. This will be used later on to detect primary object-to-playfield pixel collisions.
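+
+ Here is a rough sketch of the evaluation just described, written for an emulator rather than as a cycle-accurate model. The function name, the tuple layout and the attribute bit positions (palette in bits 0-1, priority in bit 5, horizontal inversion in bit 6, vertical inversion in bit 7) are assumptions; "line" is the on-screen row (0..239) the objects are being collected for.
+
```python
def evaluate_objects(oam, line, tall_objects=False):
    """oam: 64 entries of (y, tile, attributes, x).

    Returns (sprite temporary memory, overflow flag, primary object in range).
    """
    height = 16 if tall_objects else 8        # selected by $2000.5
    found = []
    overflow = False
    primary_in_range = False
    for index, (y, tile, attributes, x) in enumerate(oam):
        diff = (line - (y + 1)) & 0x1FF       # 9-bit difference
        if diff >= height:
            continue                          # not in range
        if len(found) == 8:
            overflow = True                   # bit 5 of $2002; object is dropped
            continue
        row = diff
        if attributes & 0x80:                 # vertical inversion (assumed bit 7)
            row ^= 0x0F                       # invert the 4-bit range result
        if index == 0:
            primary_in_range = True
        # Keep the 4 attribute bits other than vertical inversion (assumed layout).
        found.append((tile, x, attributes & 0x63, row))
    return found, overflow, primary_in_range
```
+
+ Note how a Y-coordinate of 255 gives a 9-bit difference of 256 or more on every screen line, which is the "can't scroll off the top" behaviour mentioned above.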
+
+
+ Playfield render pipeline details
+ ---------------------------------
+ As pattern table & palette select data is fetched, it is loaded into internal latches (the palette select data is selected from the fetched byte via a 2-bit 1-of-4 selector).
+
+ At the start of a new tile fetch phase (every 8 cc's), both latched pattern table bitmaps are loaded into the upper 8-bits of 2- 16-bit shift registers (which both shift right every clock cycle). The palette select data is also transferred into another latch during this time (which feeds the serial inputs of 2 8-bit right shift registers shifted every clock). The pixel data is fed into these extra shift registers in order to implement fine horizontal scrolling, since the periods when the PPU fetches tile data are fixed.
+
+ A single bit from each shift register is selected, to form the valid 4-bit playfield pixel for the current clock cycle. The bit selection offset is based on the fine horizontal scroll value (this selects bit positions 0..7 for all 4 shift registers). The selected 4-bit pixel data will then be fed into the multiplexer (described later) to be mixed with object data.
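+
+ A quick way to convince yourself that this arrangement gives the (16-n) delay quoted earlier is to simulate one bit travelling through a 16-bit shift register. The model below is a sketch under stated assumptions: the bitmap is loaded 8 clocks after its fetch phase starts, the tile's first pixel sits in bit 8 after the load, and the tap is sampled before the shift on each clock.
+
```python
def first_pixel_delay(fine_x):
    """Clocks from the start of a tile's fetch phase to its first output pixel."""
    shift_register = 0
    for clock in range(32):
        if clock == 8:
            # Assumed: the latched bitmap is loaded into the upper 8 bits as
            # the next fetch phase starts, first pixel in bit 8.
            shift_register |= 1 << 8
        if (shift_register >> fine_x) & 1:    # tap selected by fine X scroll
            return clock
        shift_register >>= 1                  # shifts right every clock
    raise AssertionError("marker bit never reached the tap")


if __name__ == "__main__":
    for n in range(8):
        print("fine X = %d: first pixel after %d clocks" % (n, first_pixel_delay(n)))
```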
+
+
+ Memory fetch phase 129 thru 160
+ -------------------------------
+ 1. Garbage name table byte
+ 2. Garbage name table byte
+ 3. Pattern table bitmap #0 for applicable object (for next scanline)
+ 4. Pattern table bitmap #1 for applicable object (for next scanline)
+
+ This process is repeated 8 times.
+
+
+ This is the period of time when the PPU retrieves the appropriate pattern table data for the objects to be drawn on the *next* scanline. When fewer than 8 objects exist on the next scanline (as the in-range object evaluation counter indicates), dummy pattern table fetches take place for the remaining fetches. Internally, the fetched dummy-data is discarded, and replaced with completely transparent bitmap patterns.
+
+ Although the fetched name table data is thrown away, and the name table address is somewhat unpredictable, the address does seem to relate to the first name table tile to be fetched for the next scanline. This would seem to imply that PPU cc #256 is when the PPU's scroll/address counters have their horizontal scroll values automatically updated.
+
+ It should also be noted that because this fetch is required for objects on the next scanline, it is necessary for a garbage scanline to exist prior to the very first scanline to be actually rendered, so that object attribute RAM entries can be evaluated, and the appropriate bitmap data retrieved.
+
+ As far as the wasted fetch phases here, well, what can I say. Either Nintendo's engineers were VERY lazy, and didn't want to add the small amount of extra circuitry to the PPU so that 16 object fetches could take place per scanline, or Nintendo couldn't spot the extra memory required to implement 16 object scanlines. Thing is though- between the object attribute mem, sprite temporary & buffer mem, and palette mem, that's already 2406 bits of RAM; I don't think it would've killed them to just add the 408 bits it would've taken for an extra 8 objects, which would've made games with horrible OAM cycling (Double Dragon 2 w/ 2 players) look half-decent (hell, with 16 object scanlines, games would hardly even need OAM cycling).
+
+
+ Details of object pattern fetch & render
+ ----------------------------------------
+ Where the PPU fetches pattern table data for an individual object is conditioned on the contents of the sprite temporary memory element, and $2000.5. If $2000.5 = 0, the tile index data is used as usual, and $2000.3 selects the pattern table to use. If $2000.5 = 1, the MSB of the range result value becomes the LSB of the indexed tile, and the LSB of the tile index value determines pattern table selection. The lower 3 bits of the range result value are always used as the fine vertical offset into the selected pattern.
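+
+ In address terms, the selection rules above might be sketched like this. The function and parameter names are made up, and the pattern table layout (tables at $0000 and $1000, 16 bytes per tile, second bitmap 8 bytes after the first) is the usual one and is assumed here rather than taken from this section.
+
```python
def object_pattern_address(tile_index, range_result, tall_objects, table_select):
    """Return the addresses of pattern bitmaps #0 and #1 for one object row.

    range_result is the 4-bit value kept in sprite temporary memory,
    tall_objects is $2000.5 and table_select is $2000.3.
    """
    row = range_result & 0x07                       # fine vertical offset
    if tall_objects:
        table = tile_index & 0x01                   # LSB of index picks the table
        tile = (tile_index & 0xFE) | ((range_result >> 3) & 0x01)
    else:
        table = table_select & 0x01
        tile = tile_index
    bitmap0 = (table << 12) | (tile << 4) | row
    return bitmap0, bitmap0 + 8
```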
+
+ Horizontal inversion (bit order reversing) is applied to fetched bitmaps, if indicated in the sprite temporary memory element.
+
+ The fetched pattern table data (which is 2 bytes), plus the associated 3 attribute bits (palette select & priority), and the x coordinate byte in sprite temporary memory are then loaded into a part of the PPU called the "sprite buffer memory" (the primary object present bit is also copied). This memory area again, is large enough to hold the contents for 8 sprites.
+
+ The composition of one sprite buffer element here is: 2 8-bit shift registers (the fetched pattern table data is loaded in here, where it will be serialized at the appropriate time), a 3-bit latch (which holds the color & priority data for an object), and an 8-bit down counter (this is where the x coordinate is loaded).
+
+ The counter is decremented every time the PPU renders a pixel (the first 256 cc's of a scanline; see "Memory fetch phase 1 thru 128" above). When the counter equals 0, the pattern table data in the shift registers will start to serialize (1 shift per clock). Before this time, or 8 clocks after, consider the outputs of the serializers for each stage to be 0 (transparency).
+
+ The streams of all 8 object serializers are prioritized, and ultimately only one stream (with palette select & priority information) is selected for output to the multiplexer (where object & playfield pixels are prioritized).
+
+ The data for the first sprite buffer entry (including the primary object present flag) has the first chance to enter the multiplexer, if its output pixel is non-transparent (non-zero). Otherwise, priority is passed to the next serializer in the sprite buffer memory, and the test for non-transparency is made again (the primary object present status will always be passed to the multiplexer as false in this case). This is done until the last (8th) stage is reached, when the object data is passed through unconditionally. Keep in mind that this whole process occurs every clock cycle (hardware is used to determine priority instantly).
+
+ The multiplexer does 2 things: determines primary object collisions, and decides which pixel data to pass through to index the palette RAM- either the playfield's or the object's.
+
+ Primary object collisions occur when a non-transparent playfield pixel coincides with a non-transparent object pixel, while the primary object present status entering the multiplexer for the current clock cycle is true. This causes a flip-flop ($2002.6) to be set, and remains set (presumably) some time after the VINT occurrence (perhaps up until scanline 20?).
+
+ The decision for selecting the data to pass through to the palette index is made rather easily. The condition to use object (opposed to playfield) data is:
+
+ (OBJpri=foreground OR PFpixel=xparent) AND OBJpixel<>xparent
+
+ Since the PPU has 2 palettes; one for objects, and one for playfield, the appropriate palette will be selected depending on which pixel data is passed through.
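+
+ Put together, the object prioritizer and the multiplexer can be sketched per pixel as below. The names and tuple layout are made up for illustration; pixels are the 2-bit pattern values, with 0 meaning transparent.
+
```python
def select_object(serializers):
    """Pick one object pixel from the sprite buffer outputs, in buffer order.

    serializers: list of (pixel, palette, foreground, primary) tuples.
    """
    for position, (pixel, palette, foreground, primary) in enumerate(serializers):
        is_last = position == len(serializers) - 1
        if pixel != 0 or is_last:
            # Only the first entry can pass on the primary object flag.
            return pixel, palette, foreground, bool(primary and position == 0)
    return 0, 0, False, False                 # empty sprite buffer


def multiplex(pf_pixel, obj):
    """Return (source, pixel, collision) for one clock cycle."""
    obj_pixel, _palette, foreground, primary = obj
    collision = bool(primary and pf_pixel != 0 and obj_pixel != 0)
    # (OBJpri=foreground OR PFpixel=xparent) AND OBJpixel<>xparent
    if (foreground or pf_pixel == 0) and obj_pixel != 0:
        return "object", obj_pixel, collision
    return "playfield", pf_pixel, collision
```
+
+ Feeding this a background-priority object ahead of a foreground-priority one, over a solid playfield pixel, returns the playfield pixel: the first object wins the object contest and then loses to the playfield.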
+
+ After the palette look-up, the operation of events follows the aforementioned steps in the "video signal generation" section.
+
+
+ Memory fetch phase 161 thru 168
+ -------------------------------
+ 1. Name table byte
+ 2. Attribute table byte
+ 3. Pattern table bitmap #0 (for next scanline)
+ 4. Pattern table bitmap #1 (for next scanline)
+
+ This process is repeated 2 times.
+
+
+ It is during this time that the PPU fetches the applicable playfield data for the first and second tiles to be rendered on the screen for the *next* scanline. These fetches initialize the internal playfield pixel pipelines (2- 16-bit shift registers) with valid bitmap data. The rest of the tiles (3..32) are fetched at the beginning of the following scanline.
+
+
+ Memory fetch phase 169 thru 170
+ -------------------------------
+ 1. Name table byte
+ 2. Name table byte
+
+
+ I'm unclear of the reason why this particular access to memory is made. The name table address that is accessed 2 times in a row here, is also the same nametable address that points to the 3rd tile to be rendered on the screen (or basically, the first name table address that will be accessed when the PPU is fetching playfield data on the next scanline).
+
+
+ After memory access 170
+ -----------------------
+ The PPU simply rests for 1 cycle here (or the equivalent of half a memory access cycle) before repeating the whole pixel/scanline rendering process.
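+
+ To summarize the whole scanline, the fetch phases above can be folded into one lookup that tells what memory fetch number n (1..170) is for. The function name and the label strings are made up for illustration.
+
```python
PLAYFIELD = ("name table", "attribute table",
             "pattern bitmap #0", "pattern bitmap #1")
OBJECTS = ("garbage name table", "garbage name table",
           "object bitmap #0", "object bitmap #1")


def fetch_kind(n):
    """Describe memory fetch n (1..170) of a rendering scanline."""
    if not 1 <= n <= 170:
        raise ValueError("a scanline has memory fetches 1..170")
    step = (n - 1) % 4
    if n <= 128:
        return PLAYFIELD[step]                     # tiles 3..34 of this line
    if n <= 160:
        return OBJECTS[step]                       # objects for the next line
    if n <= 168:
        return PLAYFIELD[step] + " (next line)"    # first 2 tiles of next line
    return "name table"                            # fetches 169 and 170


# 170 fetches of 2 clocks each, plus the 1 clock the PPU rests.
CLOCKS_PER_SCANLINE = 170 * 2 + 1
```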
+
+
++------------------+
+|Extra cycle frames|
++------------------+
+ Scanline 20 is the only scanline that has variable length. On every odd frame, this scanline is only 340 cycles (the dead cycle at the end is removed). This is done to cause a shift in the NTSC colorburst phase.
+
+ You see, a 3.58 MHz signal, the NTSC colorburst, is required to be modulated into a luminance carrying signal in order for color to be generated on an NTSC monitor. Since the PPU's video out consists of basically square waves (as opposed to sine waves, which would be preferred), it takes an entire colorburst cycle (1/3.58 MHz) for an NTSC monitor to identify the color of a PPU pixel accurately.
+
+ But now you remember that the PPU renders pixels at 5.37 MHz- 1.5x the rate of the colorburst. This means that if a single pixel resides on a scanline with a color different to those surrounding it, the pixel will probably be misrepresented on the screen, sometimes appearing faintly.
+
+ Well, to somewhat fix this problem, they added this extra pixel into every odd frame (shifting the colorburst phase over a bit), and changing the way the monitor interprets isolated colored pixels each frame. This is why when you play games with detailed background graphics, the background seems to flicker a bit. Once you start scrolling the screen however, it seems as if some pixels become invisible; this is how stationary PPU images would look without this cycle removed from odd frames.
+
+ Certain scroll rates expose this NTSC PPU color caveat regardless of the toggling phase shift. Some of Zelda 2's dungeon backgrounds are a good place to see this effect.
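+
+ The phase arithmetic behind this can be checked with exact fractions. The sketch below only counts colorburst cycles per pixel, using the 1.5x ratio and the frame lengths given in this document; the names are made up, and it says nothing about how a particular monitor decodes the result.
+
```python
from fractions import Fraction

COLORBURST_PER_PIXEL = Fraction(2, 3)   # pixels are rendered at 1.5x the colorburst

NORMAL_FRAME = 341 * 262                # PPU cc's in a full-length frame
SHORT_FRAME = NORMAL_FRAME - 1          # odd frame: scanline 20 is 340 cc's


def phase_after(pixels):
    """Colorburst phase (fraction of a cycle) after the given number of pixels."""
    return (pixels * COLORBURST_PER_PIXEL) % 1


if __name__ == "__main__":
    print("after one scanline    :", phase_after(341))
    print("after a normal frame  :", phase_after(NORMAL_FRAME))
    print("after normal + short  :", phase_after(NORMAL_FRAME + SHORT_FRAME))
    print("after 3 normal frames :", phase_after(3 * NORMAL_FRAME))
```
+
+ With the shortened odd frame the phase is back where it started after 2 frames, so it alternates between two values; with all frames at full length it would take 3 frames to repeat.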
+
+
++---------------------------+
+|The MMC3's scanline counter|
++---------------------------+
+ As most people know, the MMC3 bases its scanline counter on PPU address line A13 (which is why IRQ's can be fired off manually by toggling A13 a bunch of times via $2006). What's not common knowledge is the number of times A13 is expected to toggle in a scanline (although if you've been paying close attention to the doc here, you should already know ;)
+
+ A13 was probably used for the IRQ counter (as opposed to using the PPU's /READ line) because this address line already needed to be connected to the MMC for bankswitching purposes (so in other words, to reduce the MMC3's pin count by 1). They also probably used this method of counting (as opposed to a CPU cycle counter) since A13 cycles (0 -> 1) exactly 42 times per scanline, whereas the CPU count of cycles per scanline is not an exact integer (113.67). Having said that, I guess Nintendo wanted to provide an "easy-to-use" method of generating special image effects, without making programmers have to figure out how many clock cycles to program an IRQ counter with (a pretty lame excuse for not providing an IRQ counter with CPU clock cycle precision (which would have been more useful and versatile)).
+
+ Regardless of any values PPU registers are programmed with, A13 will operate in a predictable fashion during image rendering (and if you understand how PPU addressing works, you should understand that A13 is the *only* address line with fixed behaviour during image rendering).
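+
+ The count of 42 falls straight out of the fetch order given in the scanline rendering section. The sketch below assumes that every name/attribute table fetch (garbage ones included) lands in $2000-$3FFF with A13 high, and every pattern table fetch lands in $0000-$1FFF with A13 low; the function names are made up.
+
```python
def a13_levels():
    """A13 level for each of the 170 memory fetches in a scanline."""
    levels = []
    for _ in range(32 + 8 + 2):       # playfield, object and next-line groups
        levels += [1, 1, 0, 0]        # two table fetches, then two pattern fetches
    levels += [1, 1]                  # fetches 169 and 170 (name table)
    return levels


def a13_rises_per_scanline():
    """Count 0 -> 1 transitions, treating the previous line as identical."""
    levels = a13_levels()
    previous = levels[-1]
    rises = 0
    for level in levels:
        if previous == 0 and level == 1:
            rises += 1
        previous = level
    return rises
```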
+
+
++------------------------+
+|PPU pixel priority quirk|
++------------------------+
+ Object data is prioritized between itself, then prioritized between the playfield. There are some odd side effects to this scheme of rendering, however. For instance, imagine a low priority object pixel with foreground priority, a high priority object pixel with background priority, and a playfield pixel all coinciding (all non-transparent).
+
+ Ideally, the playfield is considered to be the middle layer between background and foreground priority objects. This means that the playfield pixel should hide the background priority object pixel (regardless of object priority), and the foreground priority object should appear atop the PF pixel.
+
+ However, because of the way the PPU renders (as just described), OBJ priority is evaluated first, and therefore the background object pixel wins, which means that you'll only be seeing the PF pixel after this mess.
+
+ A good game to demonstrate this behaviour is Megaman 2. Go into airman's stage. First, jump into the energy bar, just to confirm that megaman's sprite is of a higher priority than the energy bar's. Now, get to the second half of the stage, where the clouds cover the energy bar. The energy bar will be on top of the clouds, but megaman will be behind them. Now, look what happens when you jump into the energy bar here... you see the clouds where megaman underlaps the energy bar.
+
+
++----------------------+
+|Graphical enhancements|
++----------------------+
+ Since an NES cartridge has access to the PPU bus, any number of on-cart hardware schemes can be used to enhance the graphic capabilities of the NES. After all, the PPU's playfield pipeline is very simple: it fetches 272 playfield pixels per scanline (as 34*2 byte fetches, in real-time), and outputs 256 of them to the screen (with the 0..7 pixel offset determined by the fine X scroll register), along with object data combined with it.
+
+ Essentially, you can bypass the PPU's simple scrolling system, implement a custom one on your cart (fetching bitmap data in your own fashion), and feed the PPU bitmap data in your own order.
+
+ The possibilities of this are endless (like sporting multiple playfields, or even playfield rotation/scaling), but of course what it comes down to is the amount of cartridge hardware required.
+
+ Generally, playfield rotation/scaling can be done quite easily- it only requires a few sets of 16-bit registers and adders (the 16 bits are broken up into 8.8 fixed point values). But this kind of implementation is more suited for an integrated circuit, since this would require dozens of discrete logic chips.
+
+ Multiple playfields are another thing which could be easily done. The caveat here is that pixel pipelines (i.e., shift registers) and a multiplexer would have to be implemented on the cart (not to mention exclusive name table RAM) in order to process the playfield bitmaps from multiple sources. The access to the CHR-ROM/RAM would also have to be increased- but as it stands, the CHR-ROM/RAM bandwidth is 1.34 MHz, a rather low frequency. With a memory device capable of a 10.74 MHz bandwidth, you could have 8 playfields to work with. Generally, this would be very useful for displaying multiple huge objects on the screen- without ever having to worry about annoying flicker.
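+
+ The bandwidth figures quoted here can be reproduced from the base timing: 2 pattern table bytes are fetched per 8-cc tile phase. The 21477270 Hz master clock value and the names below are assumptions made for the sake of the arithmetic.
+
```python
PPU_HZ = 21477270 / 4            # assumed exact master clock, divided by 4

PATTERN_FETCHES_PER_TILE = 2     # bitmap #0 and bitmap #1
CC_PER_TILE = 8                  # one tile fetch phase

# Pattern table bytes fetched per second: about 1.34 MHz.
CHR_BANDWIDTH_HZ = PPU_HZ * PATTERN_FETCHES_PER_TILE / CC_PER_TILE


def playfields_supported(memory_hz):
    """How many playfields a CHR memory of the given bandwidth could feed."""
    return int(memory_hz // CHR_BANDWIDTH_HZ)


if __name__ == "__main__":
    print("CHR bandwidth: %.2f MHz" % (CHR_BANDWIDTH_HZ / 1e6))
    print("playfields at 10.74 MHz:", playfields_supported(10.74e6))
```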
+
+ The only restrictions on doing any of this are:
+
+ - every 8 sequential horizontal pixels sent to the PPU must share the same palette select value. Because of this, hardware would have to be implemented to decide which palette select value to feed the PPU between 8 horizontally sequential pixels, if they do not all share the same palette select value. The on-screen results of this may not be too flattering sometimes, but this is a small price to pay to do some neat graphical tricks on the NES.
+
+ - only the playfield palette can be used. As usual, this pretty much limits your randomly accessible colors to about 12+1.
+
+ It's a damn shame that Nintendo never created an MMC which would enhance graphics on the NES in useful ways as mentioned above. The MMC5 was the only device that came close, and its only selling features were the single-tile color area, and the vertical split screen mode (which I don't think any game ever used). Considering the number of pins (100) the MMC5 had, and number of gates they put in it just for the EXRAM (which was 1K bytes), they could've put some really useful graphics hardware inside there instead.
+
+ Perhaps the infamous Color Dreams "Hellraiser" cart was the closest the NES ever came to seeing such sophisticated graphics. The cart was never released, but from what I've read, it was going to use some sort of frame buffer, and a Z80 CPU to do the graphical rendering. It had been rumored that the game had 3D graphics (or at least 2.5D) in it. If so (and the game was actually good), perhaps it would have raised a few eyebrows in the industry, and inspired Nintendo to develop a new MMC chip with similar capabilities, in order to keep the NES within its profit margin for another few years (and allow it to compete somewhat with the more advanced systems of the time).
+
+EOF
\ No newline at end of file
diff --git a/documentation/tech/ppu/loopy1.txt b/documentation/tech/ppu/loopy1.txt
new file mode 100644
index 00000000..bda6d852
--- /dev/null
+++ b/documentation/tech/ppu/loopy1.txt
@@ -0,0 +1,63 @@
+Subject: [nesdev] the skinny on nes scrolling
+Date: Tue, 13 Apr 1999 16:42:00 -0600
+From: loopy <zxcvzxcv@netzero.net>
+Reply-To: nesdev@onelist.com
+To: nesdev@onelist.com
+
+From: loopy <zxcvzxcv@netzero.net>
+
+---------
+the current information on background scrolling is sufficient for most games;
+however, there are a few that require a more complete understanding.
+
+here are the related registers:
+ (v) vram address, a.k.a. 2006 which we all know and love. (16 bits)
+ (t) another temp vram address (16 bits)
+ (you can really call them 15 bits, the last isn't used)
+ (x) tile X offset (3 bits)
+
+the ppu uses the vram address for both reading/writing to vram thru 2007,
+and for fetching nametable data to draw the background. as it's drawing the
+background, it updates the address to point to the nametable data currently
+being drawn. bits 0-11 hold the nametable address (-$2000). bits 12-14 are
+the tile Y offset.
+
+---------
+stuff that affects register contents:
+(sorry for the shorthand logic but i think it's easier to see this way)
+
+2000 write:
+ t:0000110000000000=d:00000011
+2005 first write:
+ t:0000000000011111=d:11111000
+ x=d:00000111
+2005 second write:
+ t:0000001111100000=d:11111000
+ t:0111000000000000=d:00000111
+2006 first write:
+ t:0011111100000000=d:00111111
+ t:1100000000000000=0
+2006 second write:
+ t:0000000011111111=d:11111111
+ v=t
+scanline start (if background and sprites are enabled):
+ v:0000010000011111=t:0000010000011111
+frame start (line 0) (if background and sprites are enabled):
+ v=t
+
+note! 2005 and 2006 share the toggle that selects between first/second
+writes. reading 2002 will clear it.
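The shorthand above can be read as bit-masked copies from the written byte d into t, v and x. Here is a sketch of that reading in C. The struct and function names (`PpuScroll`, `write_2005`, and so on) are made up for the example, and the two timing hooks are assumed to be called only under the "background and sprites are enabled" condition stated in the list.

```c
#include <stdint.h>

typedef struct {
    uint16_t v;      /* vram address                                  */
    uint16_t t;      /* temp vram address                             */
    uint8_t  x;      /* tile X offset (3 bits)                        */
    uint8_t  toggle; /* 0: next 2005/2006 write is first, 1: second   */
} PpuScroll;

void write_2000(PpuScroll *p, uint8_t d)
{
    /* t:0000110000000000 = d:00000011 */
    p->t = (uint16_t)((p->t & 0xF3FF) | ((d & 0x03) << 10));
}

void write_2005(PpuScroll *p, uint8_t d)
{
    if (!p->toggle) {
        /* t:0000000000011111 = d:11111000, x = d:00000111 */
        p->t = (uint16_t)((p->t & 0xFFE0) | (d >> 3));
        p->x = d & 0x07;
    } else {
        /* t:0000001111100000 = d:11111000, t:0111000000000000 = d:00000111 */
        p->t = (uint16_t)((p->t & 0x8C1F) | ((d & 0xF8) << 2) | ((d & 0x07) << 12));
    }
    p->toggle ^= 1;
}

void write_2006(PpuScroll *p, uint8_t d)
{
    if (!p->toggle) {
        /* t:0011111100000000 = d:00111111, t:1100000000000000 = 0 */
        p->t = (uint16_t)((p->t & 0x00FF) | ((d & 0x3F) << 8));
    } else {
        /* t:0000000011111111 = d:11111111, then v = t */
        p->t = (uint16_t)((p->t & 0xFF00) | d);
        p->v = p->t;
    }
    p->toggle ^= 1;
}

/* Reading 2002 clears the shared first/second write toggle. */
void read_2002(PpuScroll *p) { p->toggle = 0; }

/* v:0000010000011111 = t:0000010000011111 */
void scanline_start(PpuScroll *p)
{
    p->v = (uint16_t)((p->v & ~0x041F) | (p->t & 0x041F));
}

void frame_start(PpuScroll *p) { p->v = p->t; }
```

For example, writing $01 to 2000, then $7D and $5E to 2005, leaves t = $656F and x = 5 under this model: nametable 1, tile column 15, tile row 11, tile Y offset 6.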
+
+note! all of this info agrees with the tests i've run on a real nes. BUT
+if there's something you don't agree with, please let me know so i can verify
+it.
+
+________________________________________________________
+NetZero - We believe in a FREE Internet. Shouldn't you?
+Get your FREE Internet Access and Email at
+http://www.netzero.net/download.html
+
+------------------------------------------------------------------------
+New hobbies? New curiosities? New enthusiasms?
+http://www.ONElist.com
+Sign up for a new e-mail list today!
diff --git a/documentation/tech/ppu/loopy2.txt b/documentation/tech/ppu/loopy2.txt
new file mode 100644
index 00000000..7a4585e1
--- /dev/null
+++ b/documentation/tech/ppu/loopy2.txt
@@ -0,0 +1,33 @@
+Subject: [nesdev] Re: the skinny on nes scrolling
+Date: Tue, 13 Apr 1999 17:48:54 -0600
+From: loopy <zxcvzxcv@netzero.net>
+Reply-To: nesdev@onelist.com
+To: nesdev@onelist.com
+
+From: loopy <zxcvzxcv@netzero.net>
+
+(more notes on ppu logic)
+
+you can think of bits 0,1,2,3,4 of the vram address as the "x scroll"(*8)
+that the ppu increments as it draws. as it wraps from 31 to 0, bit 10 is
+switched. you should see how this causes horizontal wrapping between name
+tables (0,1) and (2,3).
+
+you can think of bits 5,6,7,8,9 as the "y scroll"(*8). this functions
+slightly different from the X. it wraps to 0 and bit 11 is switched when
+it's incremented from _29_ instead of 31. there are some odd side effects
+from this.. if you manually set the value above 29 (from either 2005 or
+2006), the wrapping from 29 obviously won't happen, and attrib data will be
+used as name table data. the "y scroll" still wraps to 0 from 31, but
+without switching bit 11. this explains why writing 240+ to 'Y' in 2005
+appeared as a negative scroll value.
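The two wrapping rules above can be sketched as a pair of C helpers operating on the vram address. The function names are hypothetical, and the code covers only the tile-granular "x scroll" and "y scroll" fields discussed here; it leaves the tile Y offset bits (12-14) untouched.

```c
#include <stdint.h>

/* Advance the "x scroll" field (bits 0-4). Wrapping from 31 to 0
 * switches bit 10, which selects the horizontally adjacent name table. */
uint16_t inc_coarse_x(uint16_t v)
{
    if ((v & 0x001F) == 31) {
        v = (uint16_t)(v & ~0x001F);
        v ^= 0x0400;
    } else {
        v++;
    }
    return v;
}

/* Advance the "y scroll" field (bits 5-9). Wrapping from 29 switches
 * bit 11; a value manually set above 29 keeps counting and wraps from
 * 31 to 0 without switching bit 11. */
uint16_t inc_coarse_y(uint16_t v)
{
    uint16_t y = (uint16_t)((v >> 5) & 0x1F);

    if (y == 29) {
        y = 0;
        v ^= 0x0800;
    } else if (y == 31) {
        y = 0;
    } else {
        y++;
    }
    return (uint16_t)((v & ~0x03E0) | (y << 5));
}
```

Rows 30 and 31 of a name table hold the attribute bytes, which is why a "y scroll" that starts above 29 ends up reading attrib data as name table data.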
+
+________________________________________________________
+NetZero - We believe in a FREE Internet. Shouldn't you?
+Get your FREE Internet Access and Email at
+http://www.netzero.net/download.html
+
+------------------------------------------------------------------------
+Looking for a new hobby? Want to make a new friend?
+http://www.ONElist.com
+Come join one of the 115,000 e-mail communities at ONElist!
diff --git a/documentation/tech/readme.now b/documentation/tech/readme.now
new file mode 100644
index 00000000..4575e0c6
--- /dev/null
+++ b/documentation/tech/readme.now
@@ -0,0 +1,6 @@
+Many(possibly all) of these documents contain flaws or are incomplete, so
+don't pull out your hair if there are inconsistencies between the documents,
+what's in FCE Ultra, and what you observe. That's not to say that FCE Ultra
+doesn't have its share of (emulation) flaws, though...
+
+For many more NES-related documents, try http://nesdev.parodius.com
diff --git a/documentation/tech/readme.sound b/documentation/tech/readme.sound
new file mode 100644
index 00000000..cce95040
--- /dev/null
+++ b/documentation/tech/readme.sound
@@ -0,0 +1,2 @@
+Sound information is in the "cpu" subdirectory, due to the intimate
+relationship between the sound circuitry and the cpu.
diff --git a/documentation/todo b/documentation/todo
new file mode 100644
index 00000000..5a1131dd
--- /dev/null
+++ b/documentation/todo
@@ -0,0 +1,70 @@
+The following games are broken to some extent:
+
+Crystalis: Mostly working, but the screen jumps around during
+ dialogue. It apparently resets the MMC3 IRQ counter
+ mid-scanline. It'll require low-level PPU and MMC3
+ IRQ counter emulation to function properly.
+Kyoro Chan Land: Expects a sprite hit to happen, but it has sprite 0 over
+ transparent background.
+
+*** First, things that are not on the TODO list(Don't bug me about these
+ things if you're an idiot. I don't like listening to idiots.
+ If you are not an idiot, and you can make decent arguments for why
+ these should be on the TODO list, then you can bug me.).
+
+*** General Features:
+
+ Remappable command keys(to multiple keys on the keyboard and a joystick).
+
+ Fix possible UNIF crashes(if no PRGx or CHRx chunks exist, it may crash,
+ due to changes made in 0.92).
+
+ Windows Port:
+ Support for command-line options(so that one crazy guy will quit bugging
+ me).
+
+ SDL Port:
+ Hotkey remap GUI
+
+ Figure out a good way to add "turbo" button support and then do it.
+
+ Make default svgalib video mode a non-tweaked VGA mode.
+
+ Finish the software video blitting "library", add support for 2xsai, eagle,
+ interpolation, etc. effects.
+
+
+*** Emulation:
+
+
+ ***IMPORTANT***
+ If anyone ever cares to implement movie recording/playback, we must figure
+ out what to do with some unsaved variables, like timestamp and timestampbase.
+ These variables are abused in the sound emulation code, and modifying them
+ in certain ways elsewhere can cause crashes.
+ ***IMPORTANT***
+
+ Implement cart-based expansion devices, and interfaces for them(dip switches
+ and that Datach barcode reader, and maybe others).
+
+ Fix DPCM playback and IRQ at end of playback.
+
+ Fix some 6502 emulation bugs(undocumented opcodes might not be implemented
+ correctly and I'm not sure if the IRQ flag latency is implemented correctly).
+
+ Implement more dummy CPU reads when in debug mode.
+
+ Fix MMC3 IRQ emulation.
+
+ Figure out correct timing for when the PPU refresh address register is
+ updated by the PPU(for the next scanline).
+
+ Sound frame count stuff on PAL games(is it correct?).
+
+ Fix FDS sound emulation.
+
+ Fix NMI timing and D7 of 2002 setting timing. Fixing this might require
+ a small hack. Also be aware that this might break Battletoads, particularly
+ during the second level.
+
+ Fix Zapper emulation(Chiller still doesn't always work correctly).