Speeding up VGA emulation?

Anyone got any tips regarding speeding up VGA emulation? (I'm currently processing the pixels one at a time, from VGA VRAM to the emulated screen. The rendering happens after every frame.) I'm currently at an average of 50 microseconds per pixel (text mode) and 50 milliseconds per line (int 10h mode 04h).
50 ms per scanline is pretty dreadfully slow. But it's hard to say how to improve it without knowing what you're doing.
Got the rendering speed down to ~40ms per row at ~1FPS. Got any tips on optimizing it? (Are references to memory through a byte pointer with [] slow? It seems to take about 10us. Is there another way to do this, e.g. a simple byte+index dereference instead of []?)

Atm:
8/9us is spent in the sequencer root routine (handling all pixels one by one).
22us is spent in the root routine, with only pixel basics on.
18us is spent in the root routine, with only the attribute controller on.
28/29us is spent with the renderer fully rendering (so sequencer, characters on, attribute controller on).

The DAC (converting values through the DAC color registers) is always on in all above cases.

The sequencer is essentially the main routine, handling scanlines pixel by pixel.
The character controller handles text mode operations, reading and rendering character pixels, the other option being the graphics controller (which handles graphics modes).
The attribute controller takes the pixels produced by the character or graphics controller and converts them to DAC values. It only runs during active rendering; when the output signal is supposed to be shut down (blank screen), it isn't used.
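
On the [] question above: in C, p[i] is defined as *(p + i), so both forms normally compile to the same code, and the indexing itself is unlikely to be where the 10us goes. A minimal sketch of the equivalence (the function names are made up for illustration and are not from the emulator):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a byte using the subscript form. */
static uint8_t read_indexed(const uint8_t *base, size_t index)
{
	return base[index];
}

/* Hypothetical helper: read the same byte using pointer arithmetic.
   By definition of the [] operator this is the same expression. */
static uint8_t read_deref(const uint8_t *base, size_t index)
{
	return *(base + index);
}
```

If a read through a pointer looks expensive in a profile, the cost is more likely in whatever wraps it (function calls, range checks, plane address translation) than in the subscript itself.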
@Disch
He's doing a hardcore emulation of VGA hardware.

@superfury
As an emulator, you don't need to actually do everything the VGA hardware does at the same time the hardware does it.

In other words, specialize, specialize, specialize.

Offload processing for things that the current configuration doesn't need or use to other things.


For example, the complete DAC doesn't need to be emulated while the user process is running. Emulate only the current configuration. Use lookup tables for everything.

Only when a register is modified do you need to take the time to make changes. Swap out the current DAC routine for another.

Hopefully, the slowest thing you'll need to do is colormap rotations.
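
A rough sketch of the "only do work when a register is modified" idea, using the DAC as the example. The struct and function names are hypothetical; the only hardware fact assumed is that VGA DAC channels are 6 bits wide. The register write refreshes one cached entry, and the per-pixel path is reduced to a single table read:

```c
#include <stdint.h>

/* Hypothetical state: raw DAC registers plus a cached RGB table. */
typedef struct {
	uint8_t dac[256][3];     /* Raw DAC color registers: R, G, B (0..63). */
	uint32_t rgb_cache[256]; /* Precomputed 0x00RRGGBB per DAC index. */
} DacState;

/* Scale a 6-bit channel value to 8 bits. */
static uint32_t scale6to8(uint8_t v)
{
	v &= 0x3F;
	return (uint32_t)((v << 2) | (v >> 4));
}

/* Called only when the guest writes a DAC entry: refresh that one cache slot. */
static void dac_write(DacState *s, uint8_t index, uint8_t r, uint8_t g, uint8_t b)
{
	s->dac[index][0] = r & 0x3F;
	s->dac[index][1] = g & 0x3F;
	s->dac[index][2] = b & 0x3F;
	s->rgb_cache[index] = (scale6to8(r) << 16) | (scale6to8(g) << 8) | scale6to8(b);
}

/* Per-pixel path: one table read, no register decoding. */
static uint32_t dac_lookup(const DacState *s, uint8_t index)
{
	return s->rgb_cache[index];
}
```

The same pattern can be applied to any value derived from registers (character height, character width, character set offsets): compute it in the register write handler and store the result, instead of re-deriving it for every pixel.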
The DAC is pretty fast, I think (I'm already using lookup tables there to convert DAC values to RGB values). The renderer can run at 30FPS max afaik (using SDL; with no time taken by the VGA, simply rendering takes about 30-50ms per frame).

The pixels are processed one by one (0,0, then 1,0, etc.), with a check for overflow in between, until the end of the screen. The whole thing is called by a timer thread. How long this takes depends on the video mode: the attribute controller takes about 8us, the text renderer (from VRAM to pixel on/off + attribute) takes about 20ms, and graphics is untested atm. Most of the time is spent accessing VRAM plane 2 for fonts.

During text mode, the characters are buffered one line at a time (8 pixels horizontally). When a different character or row is requested, the data is reloaded from VRAM; in the meantime a static variable is used to return the data (a 32-bit uint_32 containing the character (8 bits), attribute (8 bits), row (8 bits) and a flag (1 bit) indicating the saved state). This is afaik the slowest function during text mode.
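
For illustration, the 32-bit cached value described above could be packed roughly like this. The bit layout and the names are assumptions for the sketch, not the emulator's actual layout:

```c
#include <stdint.h>

/* Assumed layout (hypothetical): bits 0-7 character, 8-15 attribute,
   16-23 row, bit 24 "saved state is valid" flag. */
#define CELL_VALID (UINT32_C(1) << 24)

/* Pack character, attribute and row into one value and mark it valid. */
static uint32_t cell_pack(uint8_t character, uint8_t attribute, uint8_t row)
{
	return (uint32_t)character
	     | ((uint32_t)attribute << 8)
	     | ((uint32_t)row << 16)
	     | CELL_VALID;
}

static uint8_t cell_character(uint32_t c) { return (uint8_t)(c & 0xFF); }
static uint8_t cell_attribute(uint32_t c) { return (uint8_t)((c >> 8) & 0xFF); }
static uint8_t cell_row(uint32_t c)       { return (uint8_t)((c >> 16) & 0xFF); }
static int     cell_is_valid(uint32_t c)  { return (c & CELL_VALID) != 0; }
```

A value of 0 then reads as "nothing cached yet", which gives a cheap validity check before falling back to a VRAM read.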

This takes about 15us per pixel:
inline byte getcharxy(VGA_Type *VGA, byte attribute, byte character, int x, int y) //Retrieve a characters x,y pixel on/off from table!
{
	if (!VGA) //No active VGA?
	{
		raiseError("VGA","VGA getcharxy, but no VGA loaded!");
		return 0; //Nothing!
	}

	if (getcharacterwidth(VGA)!=8) //What width? 9 wide?
	{
		if (x&0xFFF8) //Extra ninth bit?
		{
			if (!(VGA->registers->AttributeControllerRegisters.REGISTERS.ATTRIBUTEMODECONTROLREGISTER.LineGraphicsEnable && ((character&0xE0)==0xC0))) //Not a line graphics character (C0-DF, highest 3 bits are 0x6) with line graphics enabled?
			{ //9th bit becomes background color!
				return 0; //Background color for 9th pixel!
			}
			--x; //Line graphics character: replicate the 8th pixel!
		}
	}

	uint_32 characterset_offset; //First, the character set, later translated to the real charset offset!
	if (attribute&0x8) //Charset A?
	{
		characterset_offset = (VGA->registers->SequencerRegisters.REGISTERS.CHARACTERMAPSELECTREGISTER.CharacterSetASelect_high<<2)|VGA->registers->SequencerRegisters.REGISTERS.CHARACTERMAPSELECTREGISTER.CharacterSetASelect_low; //Charset A!
	}
	else //Charset B?
	{
		characterset_offset = (VGA->registers->SequencerRegisters.REGISTERS.CHARACTERMAPSELECTREGISTER.CharacterSetBSelect_high<<2)|VGA->registers->SequencerRegisters.REGISTERS.CHARACTERMAPSELECTREGISTER.CharacterSetBSelect_low; //Charset B!
	}

	word characterset_startoffset[8] = {0x0000,0x4000,0x8000,0xC000,0x2000,0x6000,0xA000,0xE000}; //The offsets to use with the above registers!
	characterset_offset = characterset_startoffset[characterset_offset]; //Start offset!

	characterset_offset += OPTMUL32(character,getcharacterheight(VGA)); //Start address of character!
	characterset_offset += SAFEMODUINT32(y,getcharacterheight(VGA)); //1 byte per row!
		
	return ((readVRAMplane(VGA,0,2,characterset_offset,4)>>(x&7))&1); //Give bit from the VRAM row!
}
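
One way to cut the per-pixel cost of a routine like the one above is to do the register decoding, offset calculation and VRAM read once per character row, then expand the font byte into all 8 pixels in one go. A minimal sketch, assuming the font row byte has already been fetched; `expand_font_row` is a made-up name, and the bit order simply follows getcharxy (pixel x is bit x&7):

```c
#include <stdint.h>

/* Hypothetical helper: expand one font row byte into 8 on/off pixels.
   Bit order mirrors getcharxy: pixel x comes from bit (x & 7). */
static void expand_font_row(uint8_t font_row, uint8_t out[8])
{
	int x;
	for (x = 0; x < 8; ++x)
	{
		out[x] = (uint8_t)((font_row >> x) & 1);
	}
}
```

With this shape, the expensive part runs once per 8 pixels instead of once per pixel, and the result can be kept until the character, attribute or row changes.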
Text mode rendering routine:
inline void VGA_Sequencer_TextMode(VGA_Type *Sequencer_VGA,word Sequencer_Scanline, word Sequencer_x, word Sequencer_tempx, word Sequencer_tempy, byte Sequencer_bytepanning, VGA_AttributeInfo *Sequencer_attributeinfo) //Used to be: VGA_Type *VGA,word Scanline, word x, word tempx, word tempy, byte bytepanning, VGA_AttributeInfo *attributeinfo
{
	//Last row info!
	static word last_tempy; //Last Y position!
	static uint_32 last_chary; //Last character Y position!
	static uint_32 last_charheight; //Last character height!
	static uint_32 last_charystart; //Last start of a row!
	static byte gotlast_y = 0; //Got last to process?
	byte y_updated = 0; //Y updated (combine with checking charx)?
	//Last column&character info!
	static uint_32 last_charx; //Last character X position!
	static uint_32 last_charwidth; //Last character width!
	static byte last_bytepanning; //Last byte panning!
	static byte last_character; //Last character on that position in the VRAM plane!
	static byte last_attribute; //Last attribute on that position in the VRAM plane!
	static byte gotlast_character = 0; //Got last to process?
	static uint_32 Sequencer_textmode_charindex; //Charindex within VRAM plane 0&1!

//First: character info!
	byte charheight = getcharacterheight(Sequencer_VGA);

	Sequencer_attributeinfo->chary = Sequencer_Scanline; //Y of current character!
	Sequencer_attributeinfo->charinner_y = 0; //Set the character inner Y we need!
	if (!gotlast_character || !gotlast_y || !(gotlast_y && last_charheight==charheight && last_tempy==Sequencer_tempy)) //Different character row?
	{
		uint_32 last_charysaved = last_chary; //For checking if we're a different character line!
		last_chary = Sequencer_tempy; //Last temp Y saving for next need!
		last_charheight = charheight; //Last charheight!
		last_charystart = getVRAMScanlineStart(Sequencer_VGA,Sequencer_attributeinfo->chary); //Calculate row start!
		y_updated = (last_charysaved!=last_chary); //Row changed?
		gotlast_y = 1; //We've got the last Y and have been updated!
	} //This works!

	//tempx is always different, we can safely assume!
	byte charwidth = getcharacterwidth(Sequencer_VGA); //Character width!
	byte charx = Sequencer_attributeinfo->charx = OPTDIV(Sequencer_tempx,charwidth); //X of current character!
	byte charinnerx = Sequencer_attributeinfo->charinner_x = OPTMOD(Sequencer_tempx,charwidth); //Current pixel within the ScanLine!

	if (y_updated || !(gotlast_character && last_charwidth==charwidth && last_charx==charx && last_bytepanning==Sequencer_bytepanning)) //Not the same character or Y updated?
	{
		y_updated = 0; //Return to the default: Y isn't updated anymore!
		Sequencer_textmode_charindex = last_charystart; //Get the start of the row!
		Sequencer_attributeinfo->charx = last_charx = charx; //Last charx!
		Sequencer_textmode_charindex += charx; //Add the character column for the base character index!
		last_bytepanning = Sequencer_bytepanning; //Last bytepanning!
		Sequencer_textmode_charindex += Sequencer_bytepanning; //Apply byte panning to the index!
		last_charwidth = charwidth; //Update last characterwidth!
		word vramstart = Sequencer_VGA->precalcs.startaddress; //Start address of VRAM display!
		last_character = readVRAMplane(Sequencer_VGA,vramstart,0,Sequencer_textmode_charindex,2); //The character itself! From plane 0!
		last_attribute = Sequencer_attributeinfo->attribute = readVRAMplane(Sequencer_VGA,vramstart,1,Sequencer_textmode_charindex,2); //The attribute itself! From plane 1!
		gotlast_character = 1; //We've got the last character data!
	}
	
	Sequencer_attributeinfo->attribute_graphics = 0; //We're a text-mode attribute, needed for the attribute controller!
	byte y2=Sequencer_VGA->LinesToRender; //The amount to render!
	if (!y2) return; //Abort when nothing to render!
	for (;;) //Process all lines to render!
	{
		--y2; //Next pixel!
		byte pixel = getcharxy(Sequencer_VGA,last_attribute,last_character,charinnerx,y2); //Check for the character, the simple way!
		if (!pixel) //Not already on?
		{
			pixel = is_cursorscanline(Sequencer_VGA,Sequencer_x,y2,Sequencer_textmode_charindex,Sequencer_attributeinfo); //Get if we're to plot font, include cursor? (Else back) Used to be: VGA,attributeinfo->charinner_y,charindex
		}
		Sequencer_VGA->CurrentScanLine[y2][Sequencer_x] = pixel; //Set the pixel to use!
		if (!y2) return; //Stop searching when done!
	}
}