User:DanClemmensen/Archive 2008-02-22
This page started as a copy of my log, crudely wikified from my original HTML document. The format is therefore crude. The document was originally a note from me to myself, and is therefore not very coherent. I am using this page as my active log, so it will change from day to day, sometimes in incoherent and inconsistent ways. I will periodically archive the page and start over.
JOP Project, 2008-02-22
I became interested in JOP and FPGAs, and started my project in 2007-12. History from then until 2008-02-22 is archived here.
The Java Optimized Processor (JOP) is a soft-core implementation of a JVM. It has been implemented on multiple FPGAs, including the Xilinx Spartan 3 family.
Status as of 2008-02-22
- ISE Webpack: installed and working
- GHDL and GTKView installed and working
- Spartan 3E kit: cabled up and programmable via USB
- DDR SDRAM driver: working in standalone test
- JOP project installed and working on Linux
- JOP partially adapted for Spartan 3E
- makefiles working
- "blink" runs in sim (JOPM microcode minimal demo)
- "blink" runs on target
- jvm.asm reduced to below 1K words
- SPI and HID primitive peripherals written
The next major goal is to get JOP running on the target. To do this, I must:
- Write "blink" in bytecode (done)
- integrate the bytecode into the FPGA load (done)
- adapt Jasmin to handle jopsys extensions (done)
- write a method extractor to extract the method from the classfile
- create the proper Xblockgen template.
- adjust the makefile
- Debug "blink" (done)
- Convert to Java rather than Jasmin. (done.)
- Test the HID peripheral (includes the LCD. I need this for debugging)
- Test the SPI peripheral (includes the platform flash reader.)
- Write a bootloader in bytecode
- Write a trivial test app ("hello jop world")
- add the bytecode application to the PROM image
- integrate the DDR SDRAM interface
- write a fast SPI peripheral.
Later major goals:
- integrate the Ethernet MAC
- write the "OS" application
- write the STM application
- Development environment enhancements
- Eclipse
- batch IMPACT
- VHDL2SVG "schematic" documentation
- JOP enhancements
- remove offtbl
- remove jtbl
- shorten pipeline?
JSTM Architecture
The JSTM is an STM controller based on JOP and running on a Spartan 3E starter kit. The interface to the STM hardware will be via the ADCs and DACs on the board and via some digital signals to drive the stepper motor for coarse positioning. The interface to the rest of the world is via TCP/IP over Ethernet.
To support this, we will first get a standalone minimal JOP running. We will then add additional modules until we have a full-up JOP system. Finally we will add the STM.
If the Ethernet is too hard to implement, we will revert to USB, but this is a multi-step process because it means switching to Xup. If we switch to Xup, we will first need a Xup flash programmer (or something) to permit easy switching of USB between programming and normal use.
- define off-board interface.
- design off-board STM circuit.
- build off-board STM circuit and connectors.
- create the FPGA drivers for the STM.
- wire up the STM
- write the Java to run the STM and send frames to Linux
- design a custom STM board to replace the devel board.
JOP Core
It is difficult to test the JOP core and HID separately, so I will first study the JOP core again and then design the HID by example from other JOP peripherals and from the Xilinx startup reference design.
Analyze the Spartan 3 project file list to decide what files I need. These files are used in the Spartan 3 project:
| File | Purpose | Disposition |
| top\jop_config_xs3.vhd | ? | |
| vhdl\core\jop_types.vhd | package type definitions. Apparently, a set of standard signal names? | keep |
| vhdl\simpcon\sc_pack.vhd | package for SimpCon definitions. Apparently, a set of standard SimpCon signal names? | keep |
| vhdl\scio\fifo.vhd | define a FIFO IO device. used in sc_uart.vhd? | |
| vhdl\scio\sc_uart.vhd | UART. SimpCon. | |
| vhdl\scio\sc_sys.vhd | counter/timer, timer interrupt, wd, and general interrupt handler. SimpCon. | keep |
| vhdl\scio\scio_min.vhd | minimal IO suite. ctr, wd. SimpCon. Used for code download? | keep |
| vhdl\core\extension.vhd | core's interface to memory, multiplier and IO MUX for din from stack | keep |
| vhdl\core\bcfetch.vhd | bytecode fetch and address translation | keep |
| vhdl\core\fetch.vhd | instruction fetch and branch (formerly included bcfetch) | keep |
| vhdl\core\shift.vhd | barrel shifter | keep |
| vhdl\core\cache.vhd | Bytecode caching (method cache?) | keep |
| vhdl\xilinx\xs3_jbc.vhd | bytecode memory/cache for JOP (version for Xilinx Spartan-3) | keep (modify?) |
| vhdl\memory\mem_sc.vhd | External memory interface with SimpCon. Translates between JOP/extension memory interface and SimpCon memory interface. | keep |
| vhdl\memory\sc_sram32.vhd | SimpCon compliant external memory interface for 32-bit SRAM (e.g. Cyclone board, Spartan-3 Starter Kit) | replace with DDR16 |
| vhdl\core\stack.vhd | Stack/Alu | keep |
| vhdl\core\mul.vhd | booth multiplier (32x32 serial) | keep |
| vhdl\core\core.vhd | cpu core, stack, pc connections (executes microcode?) | keep |
| vhdl\core\decode.vhd | microcode decoder. generate control for pc and stack | keep |
| vhdl\jtbl.vhd | (generated by jopa. bytecodes offsets into JOPM) | |
| vhdl\offtbl.vhd | (generated by jopa. microcode relative jump table) | |
| vhdl\rom.vhd | microcode. | |
| vhdl\xilinx\xram.vhd | internal memory for JOP3 modified for Xilinx ISE to use Block SelectRAM+ | keep (modify?) |
| vhdl\xram_block.vhd | | |
| vhdl\core\jopcpu.vhd | The JOP CPU. (apparently a wrapper for several of the other core files.) | keep |
| vhdl\top\jop_xs3.vhd | top level for Spartan-3 Starter Kit | keep (modify) |
JOPM Optimization
The JOP microcode uses a 10-bit word. The Spartan 3E FPGA uses a BRAM with a hardware word width of 1, 2, 4, 8, 9, 16, 18, 32, or 36 bits. The 10-bit word is just about the worst possible size.
Of the ten bits, two of the bits are dedicated: nxt and opd. These bits have several interesting characteristics: they are never both set at the same time, and neither bit is set for a BZ or BNZ.
On a different note, we can add 8 bits to every instruction. If we can get the pipeline to not stall, we use the bits as branch bits. By allocating location 0 as "nxt" we get one more bit. This eliminates the offtbl.
To eliminate the jtbl, Java bytecodes will immediately jump to their own offsets (0-255) in the JOPM. Each instruction will have a "next" address, which is either zero or the next relative instruction. The next relative instruction is either in the next page or the previous page, where a "page" is 256 instruction words: there are a total of 4 pages of microcode (1024 instructions). A bz/bnz is different: it branches to any pair of locations. The instruction branches to the even word of the pair if the condition is zero and to the odd word if the condition is non-zero. This means that the current 64 branch instructions are reduced to 1, and we have 63 extra instructions to use. If this reduces the instruction count below 128, we may recover an additional bit, which extends the jump address to ten bits and removes the restriction.
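The two address rules above can be sketched in Java. This is only a model of the idea, not JOP code: the class and method names are hypothetical, and the selection between the next and previous page is left out because it is not yet defined.

```java
/** Hypothetical model of the proposed microcode addressing rules. */
final class MicrocodeAddressing {
    /** A "page" is 256 instruction words; 4 pages give 1024 words. */
    static final int PAGE_SIZE = 256;

    /** With no jtbl, a bytecode enters the microcode at its own opcode value (0-255). */
    static int entryAddress(int opcode) {
        return opcode & 0xff;
    }

    /**
     * bz/bnz target: the branch names a pair of locations. The even word of
     * the pair is taken when the condition is zero, the odd word otherwise.
     */
    static int branchTarget(int pairAddress, int condition) {
        int even = pairAddress & ~1;           // force the low bit to 0
        return condition == 0 ? even : even | 1;
    }

    public static void main(String[] args) {
        System.out.printf("entry for opcode 0xb8: 0x%03x%n", entryAddress(0xb8));
        System.out.printf("pair 0x124, cond 0: 0x%03x%n", branchTarget(0x124, 0));
        System.out.printf("pair 0x124, cond 7: 0x%03x%n", branchTarget(0x124, 7));
    }
}
```

Because the condition only selects the low address bit, one branch opcode serves both the zero and non-zero cases.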
Bytecode Bootloader
I removed the bootloader from the JOPM to reduce the number of words to below 1024. I must now add a bootloader somewhere, either as a separate VHDL state machine or in bytecode. Bytecode is "free" on the Spartan 3E, because the amount of program flash used for the FPGA is fixed no matter what the FPGA configuration is. Therefore, we can initialize the method cache BRAM at no cost. We place a boot loader in the method cache, but we will probably need a very small init sequence in the JOPM as well, to get the stack and jpc set up.
The boot loader must be written in "jbc--": the subset of the Java bytecodes that is implemented in the JOPM. Bytecodes that are themselves implemented in Java cannot be used. We therefore need an assembler for this language. The standard jbc assembler is Jasmin, which emits a classfile. Our alternatives are to either write a really primitive assembler, or use Jasmin and write a really primitive classfile decoder. We can use XBlockGen to create the initialized method cache in either case.
Analysis of the existing boot loader shows that it performs the following steps:
- stack pointer <= stack_init
- if CPU 0:
- moncnt <= 1
- load word 0, word 1, and data into RAM
- word 0 is length of load
- word 1 is method pointer
- remaining words are data, not analyzed by loader
- heap <= word 0
- mp <= word 1
- jjp <= (mp+1)
- jjhp <= (mp+2)
- invoke mp
We still need to analyze how to cause the JOPM to execute a method that is pre-loaded into the method cache. Basically, we need to execute a "nxt" with the jpc pointing to the correct bytecode and with a valid stack?
Write a "Blink" program in jbc-- and run it!
Bytecode Blink
As a preliminary step to bytecode bootload, we must be able to assemble the bytecode and produce the BRAM image for the method cache. This is a three-step process:
- Assemble a jasm file to produce a classfile
- extract the method from the classfile into a .dat file
- Create the Xilinx BRAM .vhd file
Assemble: modify Jasmin to accept the jopsys extension bytecodes
Extract: write a new tool in Java using the BCEL library
Dat-to-BRAM: new template for XBlockGen, and upgrade XBlockGen to handle the different aspect.
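The middle step can be illustrated with a small Java sketch that packs extracted method bytes into 32-bit words, which is the word size the method cache image uses. Two things here are assumptions rather than facts from the tools: the big-endian byte order within a word, and the .dat layout of one hexadecimal word per line. The class name is hypothetical, and the BCEL extraction itself is not shown.

```java
/** Hypothetical helper: pack method bytecode into 32-bit words for a .dat file. */
final class MethodWords {
    /** Packs bytes into words, first byte in the top 8 bits (assumed order). */
    static int[] pack(byte[] code) {
        int[] words = new int[(code.length + 3) / 4];   // round up, zero padded
        for (int i = 0; i < code.length; i++) {
            words[i / 4] |= (code[i] & 0xff) << (24 - 8 * (i % 4));
        }
        return words;
    }

    /** Renders one 8-digit hex word per line (assumed .dat layout). */
    static String toDat(int[] words) {
        StringBuilder sb = new StringBuilder();
        for (int w : words) {
            sb.append(String.format("%08x", w)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // iconst_1, istore_1, goto: three bytes, padded to one word
        byte[] code = {0x04, 0x3c, (byte) 0xa7};
        System.out.print(toDat(pack(code)));
    }
}
```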
Batch IMPACT
Impact failed again, due to TCL incompatibilities, I think. I will now research the batch mode. Apparently, the easiest way to produce batch command files is to run impact, save the emitted internal batch file, whose name is _impact.cmd, and use it as an example.
For some reason, impact was actually working but was putting the file in the parent directory.
Java Bootloader
Maybe write the bootloader in Java? Rules:
- one method
- may call Native.xxx, but nothing else. These will be "JOPized" into native bytecodes.
- must not use constructs that compile to "trapped" bytecodes.
- Use of constants, statics, and locals must be investigated.
- Is Startup.java a worked example?
- Should Startup.java become part of boot?
JOPizer is too tightly connected to the rest of the JOP build system for me to understand, so I will build a standalone program named ExtractBoot, based on ExtractMethod.
The class will include definitions of native methods in addition to the main method. We use BCEL for the hard work. We parse the class, then find the main method and find each of its instructions. We replace each invocation of a native method with the appropriate native bytecode, then call BCEL once more to recompute the jumps. Finally, we emit the result.
I am unclear on constants and variables.
Preliminary code is now tested and working, but constants are a problem -- how big a problem I do not know.
Next steps:
- analyze JOPM and categorize all bytecodes
- unrestricted
- memory access
- not implemented
- add a table and checker in ExtractBoot
If the problem is only with the ldc instruction, then ExtractBoot can convert each ldc into a sequence:
- sipush hi
- bipush 16
- ishl
- sipush lo
- ior
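One caveat about this sequence is worth checking in code: sipush sign-extends its 16-bit operand, so when bit 15 of the low half is set, the upper 16 bits of the pushed value are all ones and the ior overwrites the high half. The Java sketch below models the sequence as listed and a variant that uses iadd with an adjusted high half. The class and method names are hypothetical, and the iadd variant is a suggestion, not something ExtractBoot is known to do.

```java
/** Hypothetical model of expanding ldc into sipush/bipush/ishl/sipush sequences. */
final class LdcExpansion {
    /** Models sipush: the 16-bit operand is sign-extended to an int. */
    static int sipush(int operand16) {
        return (short) operand16;
    }

    /** The sequence as listed: sipush hi; bipush 16; ishl; sipush lo; ior. */
    static int viaIor(int value) {
        int hi = value >>> 16;
        int lo = value & 0xffff;
        return (sipush(hi) << 16) | sipush(lo);
    }

    /**
     * Variant: sipush hi'; bipush 16; ishl; sipush lo; iadd.
     * hi' is computed from (value - signExtended(lo)), so the borrow caused
     * by a negative low half is compensated in the high half.
     */
    static int viaAdd(int value) {
        int lo = sipush(value & 0xffff);
        int hi = (value - lo) >>> 16;
        return (sipush(hi) << 16) + lo;
    }

    public static void main(String[] args) {
        int[] samples = {0x12345678, 0x12348000, -1, 0x7fff8000};
        for (int v : samples) {
            System.out.printf("%08x ior:%08x add:%08x%n", v, viaIor(v), viaAdd(v));
        }
    }
}
```

The ior form is correct whenever the low half is below 0x8000, so a checker could also keep the original sequence and fall back to the iadd form only when needed.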
Emitting Annotated Boot Bytecode
For debug, ExtractBoot should emit annotated bytecode, with addresses and symbolic instructions. Just to be elegant, the output should be usable by Jasmin, although we probably will not need this feature.
ExtractBoot's first pass results in an array that has an entry for each original instruction, but ldc instructions are replaced by multi-instruction sets.
LCD Commands
"memory" based.
two rows: 0-0x27 and 0x40-0x67.
unshifted, 0-0x0f and 0x40-0x4f are displayed
Each command is sent as two nybbles. We will write, not read: LCD_RW is always 0.
command occurs when lcd_e is driven hi: write data with lcd_e lo, then hi for at least 12 cycles, then lo.
write hi nybble, wait 50, write lo nybble wait 2000. If cmd, rs=0. if data, rs=1.
CLR: 01
HOME: 02
Cursor mode:
- 10 (mv cursor left)
- 14 (mv cursor right)
- 18 (scroll display left)
- 1c (scroll display right)
SET addr: 0x80|addr
WRITE: xx (30-39 digits, 41-46 A-F)
setup sequence:
- write nybble 3, wait 205,000 cycles
- write nybble 3, wait 5,000 cycles
- write nybble 3, wait 2,000 cycles
- write nybble 2, wait 2,000 cycles
- FUNC SET 0x28
- EM SET 0x06
- DISPLAY 0x0c
- CLR 0x01, wait 82,000 cycles
To display a word:
- HOME
- write 8 nybbles
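The byte-level part of these notes can be sketched in Java: splitting a command or data byte into its two nybbles, forming the SET addr command, and turning a 32-bit word into the eight character codes to write after HOME. Timing and the lcd_e/rs pin handling are not modeled, and the class and method names are hypothetical.

```java
/** Hypothetical helpers for the LCD byte values described in the notes. */
final class LcdBytes {
    static final int CLR = 0x01;
    static final int HOME = 0x02;

    /** SET addr command: 0x80 | addr (row 0 starts at 0x00, row 1 at 0x40). */
    static int setAddress(int addr) {
        return 0x80 | (addr & 0x7f);
    }

    /** Each byte is sent as two nybbles, high nybble first. */
    static int[] nybbles(int b) {
        return new int[] {(b >> 4) & 0xf, b & 0xf};
    }

    /** Character code for one hex digit: 0x30-0x39 for 0-9, 0x41-0x46 for A-F. */
    static int hexDigitCode(int n) {
        n &= 0xf;
        return n < 10 ? 0x30 + n : 0x41 + (n - 10);
    }

    /** The eight character codes for a 32-bit word, most significant digit first. */
    static int[] wordToChars(int word) {
        int[] out = new int[8];
        for (int i = 0; i < 8; i++) {
            out[i] = hexDigitCode(word >>> (28 - 4 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        for (int c : wordToChars(0xCAFE0123)) {
            System.out.print((char) c);
        }
        System.out.println();
    }
}
```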
IMPACT batch mode
Running IMPACT in GUI mode has become a major irritant, so I must now spend the time to figure out how to run it in batch mode. I use IMPACT for two distinct functions:
- Given an FPGA image in the file main.bit, create a PROM image named main.mcs.
- Given a PROM image named main.mcs, place this image in the xcf04s serial PROM on the target board.
Xilinx's vision of the IMPACT functionality is much more elaborate than this, so it is difficult to determine exactly how to run IMPACT in batch mode to perform these two simple functions. I am making progress, however. Xilinx did document the batch mode for IMPACT 8.x in this document, and the batch mode seems to continue to work in ISE 9.2i.
I require the two functions to be distinct because I intend to modify the .mcs file to append the JOP application. In addition, I may choose to modify the .bit file after it is generated by the FPGA synthesis. Xilinx provides a tool named data2mem for this purpose: if you already have a fully-synthesized bitfile, and you wish to change only the contents of one or more of the BRAMs, you may use data2mem. Full synthesis takes a few minutes of CPU time, while data2mem takes at most a few seconds. If we can fully automate the two IMPACT steps, we can test new versions of the microcode or the bytecode bootloader very quickly.
- create the prombuilder batch file
- create the promloader batch file
- create a good bitfile
- validate the bitfile using the traditional method (to ensure that I have not damaged the target board's load). Save the .mcs file
- use the new batch prombuilder script to create the .mcs file. Use diff to compare it to the "traditional" .mcs file
- use the new promloader script to download the .mcs file to the prom. Observe that it works.
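Since an .mcs file is text in the Intel hex record format, appending the JOP application means emitting additional records with correct checksums. The Java sketch below builds single records; the class and method names are hypothetical, and where the application should start in the PROM address space is not addressed here.

```java
/** Hypothetical builder for Intel hex records, the format used by .mcs files. */
final class McsRecord {
    static final int TYPE_DATA = 0x00;
    static final int TYPE_EOF = 0x01;
    static final int TYPE_EXT_LINEAR_ADDR = 0x04;

    /**
     * Builds ":LLAAAATT<data>CC". The checksum is the two's complement of
     * the sum of the length, address, type, and data bytes, modulo 256.
     */
    static String record(int type, int address16, byte[] data) {
        int sum = data.length + ((address16 >> 8) & 0xff) + (address16 & 0xff) + type;
        StringBuilder sb = new StringBuilder(":");
        sb.append(String.format("%02X%04X%02X", data.length, address16 & 0xffff, type));
        for (byte b : data) {
            sb.append(String.format("%02X", b & 0xff));
            sum += b & 0xff;
        }
        sb.append(String.format("%02X", (-sum) & 0xff));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(record(TYPE_DATA, 0x0030, new byte[] {0x02, 0x33, 0x7A}));
        System.out.println(record(TYPE_EOF, 0, new byte[0]));
    }
}
```

An appended image has to go before the end-of-file record, and data above the first 64K needs a type 04 record to set the upper 16 address bits first.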
Log
| Date | Activity |
| 2008-02-22 | archived the old stuff from this document |
| | wrote plan for next steps |
| | wrote blink.jasm |
| | added jopsys bytecodes to Jasmin. |
| | compiled blink.jasm |
| | analyzed blink.class |
| | investigated BCEL. JOP already uses it and it is distributed with JOP. |
| | wrote the bytecode extractor |
| | updated makefile. |
| | created template, updated makefile. |
| 2008-02-23 | updated XblockGen |
| | cleaned up Extract |
| | debugged makefile |
| | moved ExtractMethod to tools directory |
| | debugged makefile (classpath hell.) |
| | debugged Xblockgen |
| | built with ghdl (debugged method mam template) |
| | ran in sim. Works, but iconst_1 stacks a zero?? Finally traced to use of an out-of-date generated file that is the Stack RAM init. Stack RAM also contains the microcode consts -- will shift to blockgen. |
| | updated makefile, built template, debugged Xblockgen for 32-bit words, built and ran in sim. It works! |
| | built for target. found and fixed linking errors in new scio. build completed. |
| | tried to use Impact GUI to build prom. Failed. Probably the TCL popup thing again. will research the impact batch mode some more. |
| 2008-02-24 | Impact was sort of working: it placed the output in the parent directory. weird |
| | downloaded to target. it works for perhaps 500 blinks and then stops. probably stack overflow. |
| | began investigating writing boot in Java. need to "JOPize" the result. After several hours, I gave up on using JOPizer directly. I will start from ExtractMethod instead. |
| | completed writing ExtractBoot |
| 2008-02-25 | struggled with Java static structure init. Finally used a kludge |
| | extracted the boot from boot.class |
| 2008-02-26 | built for the board and downloaded. not blinking, or perhaps blinking too fast. |
| | in sim, blinking too fast. |
| | analyzed the code: it uses the constant pool. ExtractBoot does not extract the constant pool, and I do not know where it resides in any case. |
| | added ldc fixup to ExtractBoot |
| | added "illegal" opcode check to ExtractBoot |
| | extracted "blink" and ran it. It works! |
| 2008-02-27 | adding HID test to boot. |
| 2008-02-28 | compiled HID test. ExtractBoot fails because I am replacing goto targets. |
| | fixed ExtractBoot |
| | compiled and ran HID test. fails. Investigated. DUH! no top-level connections in UCF for the LCD! |
| 2008-02-29 | Plugged my new USB serial port into my computer. Not supported by my custom Linux, must rebuild Linux. |
| | further research on data2mem. this utility is in fact part of the Xilinx Webpack. we can therefore use it to change the BRAM contents if no other VHDL (or UCF) file has changed. |
| 2008-03-01 | re-wrote boot.java to use #defines. |
| | adjusted UCF to map the HID pins. |
| | added debug output file to ExtractBoot to permit evaluation. |
| 2008-03-03 | researched batch IMPACT. Found this Xilinx site. |
| | Tried to implement the prom-build script. It appears to work. |
| 2008-03-05 | verified batch prom-build |
| | Created and debugged batch prom load. We now have a fully-automated Xilinx build system! Back to work on LCD... |
| 2008-03-06 | LCD is working! I had failed to set the sp and vp properly before starting the Bytecode. Found in sim. |
| | LCD code is a pig. 650+ bytes. The entire flash bootloader, with LCD, is less than 900 bytes. Would still fit in the smallest JOP method cache. My method cache is 2048 bytes, so I'm OK. |
| | Flash read didn't work. To debug, added read of board switches and buttons to control display points. This also failed. Found the problem: I had not specified all of the pins in the UCF file, so no chance of flash or switches working. |
| | Fixed UCF, got switches working. something is wrong with the disables for the other SPI devices... after two hours of trying alternatives, I found an inconsistency in the documentation. |
| 2008-03-07 | Found a worked Picoblaze example of the SPI at the Xilinx site: yes, the documentation is wrong. |
| | Changed SPI disables to match the example. Success! we now read from the Platform Flash! (it's bit-swapped, but we can fix that when we build the image.) |
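
The bit-swap fix mentioned in the last log entry could be applied when the image is built. A minimal Java sketch follows, assuming that "bit-swapped" means the bit order is reversed within each byte; the class and method names are hypothetical.

```java
/** Hypothetical helper: reverse the bit order within each byte of an image. */
final class BitSwap {
    /** Reverses the 8 bits of one byte value (bit 0 becomes bit 7, and so on). */
    static int reverseBits(int b) {
        // Integer.reverse flips all 32 bits, so the byte lands in the top 8 bits.
        return Integer.reverse(b & 0xff) >>> 24;
    }

    /** Returns a copy of the image with every byte bit-reversed. */
    static byte[] reverseAll(byte[] image) {
        byte[] out = new byte[image.length];
        for (int i = 0; i < image.length; i++) {
            out[i] = (byte) reverseBits(image[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.printf("0x12 -> 0x%02x%n", reverseBits(0x12));
    }
}
```

Applying the swap twice returns the original bytes, so the same routine can be used to check a read-back image.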
