Topic: FMC151 ref. design in Vivado

tedjnsn November 04, 2014, 08:29 PM

  • Member
  • *
  • Posts: 47
Hello,


I would like to get the 4DSP StellarIP reference design kc705_fmc151 to work in Vivado 2014.3, and I wonder if I could get some advice in this forum.


So far I've imported the VHDL source, except for the Xilinx IP modules, which I added in Vivado by hand using the original settings from the reference design. The reasoning was that the new IP comes with proper internal constraints for Vivado. I've imported the pin-out constraints. The clocking constraints were created from scratch, including several asynchronous clock groups covering the derived clocks.


I was able to generate a bit file and run it through the tests with FMC15xAPP. Vivado still complains that there isn't a dedicated clocking route from IDELAY2 to BUFG, as I mentioned previously here - http://www.4dsp.com/forum/index.php/topic,2862.0.html. Xilinx confirmed that for me.


The design mostly works, but the ADC ramp pattern test in FMC15xAPP reports a 0.25 to 0.5% error rate. I saved the recorded ramp pattern, and there is an occasional single-bit error, typically in the same bit position. Is it possible that align_machine does not produce optimum results in Vivado? Or could this error come from a clock domain crossing?
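
One way to quantify the "same bit position" observation is to histogram the flipped bits in a saved capture. The following is only a sketch under stated assumptions: 14-bit samples, a ramp that increments by one LSB per sample and wraps at 2^14, a clean first sample, and no dropped samples. `check_ramp` and `RampReport` are hypothetical names and are not part of the 4DSP BSP.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kAdcBits = 14;                    // assumed sample width
constexpr std::uint16_t kMask = (1u << kAdcBits) - 1;

struct RampReport {
    std::size_t bad_samples = 0;                     // samples that differ from the ideal ramp
    std::array<std::size_t, kAdcBits> flips_per_bit{};  // how often each bit was wrong
};

// Compares a capture against an ideal +1 ramp seeded from the first sample.
RampReport check_ramp(const std::vector<std::uint16_t>& samples) {
    RampReport report;
    if (samples.empty()) return report;

    std::uint16_t expected = samples[0] & kMask;     // assumes sample 0 is good
    for (std::size_t i = 1; i < samples.size(); ++i) {
        expected = static_cast<std::uint16_t>((expected + 1u) & kMask);
        const std::uint16_t diff =
            static_cast<std::uint16_t>((samples[i] ^ expected) & kMask);
        if (diff == 0) continue;

        ++report.bad_samples;
        for (unsigned b = 0; b < kAdcBits; ++b) {
            if (diff & (1u << b)) ++report.flips_per_bit[b];
        }
    }
    return report;
}
```

If nearly all counts land in one entry of `flips_per_bit`, that would point at a single data lane rather than at the clock or a domain crossing, which would be expected to disturb several bits at once.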


I wonder if the Vivado timing implementation is too optimistic and one needs to specify setup and hold constraints for the I/O. Is there any documentation on the FMC150/151 I/O timing parameters?


Thanks in advance for any advice on this matter.



lmunoz November 05, 2014, 12:39 PM (#1)

  • Member
  • *
  • Posts: 160
Hi,

The I/O timing would be in the ADC documentation. If you are getting a single-bit error, I would track down where the error is happening with ChipScope (or the Vivado equivalent, the hardware manager). Look at what comes out of the SERDES. Are you meeting timing? The conversion from UCF to XDC is not always correct, so you should manually check the converted constraints.


Regards,
Luis
  • « Last Edit: November 05, 2014, 12:51 PM by lmunoz »

tedjnsn November 05, 2014, 06:32 PM (#2)

  • Member
  • *
  • Posts: 47
Luis,


Thanks for looking into this issue. Is there an easy method to track a bit error that only occurs about once in 500 samples, at random?


Vivado does not report timing violations except for a couple of sub-ns hold violations in amc7823_ctrl on the clock domain crossing to the SPI clock, but the board seems to report correct voltages and temperature in the app nevertheless.


Here are the timing constraints that I used with the reference design in Vivado:


Code: [Select]
create_clock -period 4.069 -name clk_ab_p_0 -waveform {0.000 2.035} [get_ports clk_ab_p_0]
create_clock -period 4.069 -name clk_to_fpga_p_0 -waveform {0.000 2.035} [get_ports clk_to_fpga_p_0]
create_clock -period 8.000 -name phy_rxclk_0 -waveform {0.000 4.000} [get_ports phy_rxclk_0]
create_clock -period 5.000 -name sysclk_p_0 -waveform {0.000 2.500} [get_ports sysclk_p_0]


create_generated_clock -name sip_fmc151_0/fmc151_if_inst/ads62p49_ctrl_inst/sclk_prebuf -source [get_pins sip_mac_engine_0/mac_engine_inst/brd_clocks_inst/pll0_inst/U0/mmcm_adv_inst/CLKOUT1] -divide_by 16 [get_pins sip_fmc151_0/fmc151_if_inst/ads62p49_ctrl_inst/sclk_prebuf_reg/Q]
create_generated_clock -name sip_fmc151_0/fmc151_if_inst/amc7823_ctrl_inst/sclk_prebuf -source [get_pins sip_mac_engine_0/mac_engine_inst/brd_clocks_inst/pll0_inst/U0/mmcm_adv_inst/CLKOUT1] -divide_by 16 [get_pins sip_fmc151_0/fmc151_if_inst/amc7823_ctrl_inst/sclk_prebuf_reg/Q]
create_generated_clock -name sip_fmc151_0/fmc151_if_inst/cdce72010_ctrl_inst/sclk_prebuf -source [get_pins sip_mac_engine_0/mac_engine_inst/brd_clocks_inst/pll0_inst/U0/mmcm_adv_inst/CLKOUT1] -divide_by 16 [get_pins sip_fmc151_0/fmc151_if_inst/cdce72010_ctrl_inst/sclk_prebuf_reg/Q]
create_generated_clock -name sip_fmc151_0/fmc151_if_inst/dac3283_ctrl_inst/sclk_prebuf -source [get_pins sip_mac_engine_0/mac_engine_inst/brd_clocks_inst/pll0_inst/U0/mmcm_adv_inst/CLKOUT1] -divide_by 16 [get_pins sip_fmc151_0/fmc151_if_inst/dac3283_ctrl_inst/sclk_prebuf_reg/Q]
create_generated_clock -name sip_fmc151_0/fmc151_if_inst/fmc151_ctrl_inst/O1 -source [get_pins sip_fmc151_0/fmc151_if_inst/fmc151_ctrl_inst/ext_trigger_prev1_reg/C] -divide_by 1 [get_pins sip_fmc151_0/fmc151_if_inst/fmc151_ctrl_inst/ext_trigger_prev1_reg/Q]
create_generated_clock -name sip_mac_engine_0/mac_engine_inst/ge_mac_stream_inst/eth_mdio_inst/sclk_prebuf -source [get_pins sip_mac_engine_0/mac_engine_inst/brd_clocks_inst/pll0_inst/U0/mmcm_adv_inst/CLKOUT0] -divide_by 16 [get_pins sip_mac_engine_0/mac_engine_inst/ge_mac_stream_inst/eth_mdio_inst/sclk_prebuf_reg/Q]
create_generated_clock -name phy_txc_gtxclk_0 -source [get_pins sip_mac_engine_0/mac_engine_inst/ge_mac_stream_inst/gmii_eth_tx_stream_inst/oddr_tx/C] -divide_by 1 [get_ports phy_txc_gtxclk_0]


set_clock_groups -asynchronous -group [get_clocks -include_generated_clocks clk_ab_p_0] -group [get_clocks -include_generated_clocks clk_to_fpga_p_0] -group [get_clocks -include_generated_clocks sysclk_p_0] -group [get_clocks -include_generated_clocks phy_rxclk_0]

lmunoz November 05, 2014, 08:15 PM (#3)

  • Member
  • *
  • Posts: 160
Perhaps you can look at the data sheet of the ADC and, instead of the ramp pattern, use all 1's, all 0's, or another constant pattern and see if you get a bit flip. With a constant pattern you can trigger on the bit flip. Also try to monitor what is going on with the IDELAY: which values is it picking, and are they consistent? If you are getting random errors once in a while, the most likely cause is that the IDELAY isn't getting set correctly.
  • « Last Edit: November 05, 2014, 08:24 PM by lmunoz »

tedjnsn November 06, 2014, 07:23 PM (#4)

  • Member
  • *
  • Posts: 47
I put in some debugging probes, and I can see the glitches appearing as early as in ads62p49_phy. I was not able to probe the IDELAY inputs and outputs due to "routing congestion". The design seems to be more glitchy with the debug probes attached.

tedjnsn November 07, 2014, 01:20 PM (#5)

  • Member
  • *
  • Posts: 47
I'm attaching an ILA snapshot of a particular spike during linear pattern generation. It looks to me like the iodelay outputs there transition at the same time, while there's jitter downstream, in the cha_sdr_se bus. I wonder, though, if the probe can capture idelay jitter, since the data is coming at double the clock rate there?

tedjnsn November 07, 2014, 01:26 PM (#6)

  • Member
  • *
  • Posts: 47
Here's another snapshot with a train of spikes in the same linear sweep capture.

lmunoz November 07, 2014, 01:57 PM (#7)

  • Member
  • *
  • Posts: 160
I am not sure I understand what is going on in the screen capture, but if I understand correctly, the overall issue is that you are trying to port the KC705_FMC151 design to Vivado. We currently don't support that, and I can only offer basic advice.

It sounds like you almost have it working, except that you get occasional bit flips on the data you are receiving. Usually that happens when the IDELAY tap values aren't set correctly, so my advice is to first confirm that the alignment machine is working.

So look in ads62p49_phy.vhd, where there are these delay values (signals delay_value_a, delay_value_b, delay_value_c). Try reading those out in the software; that happens in the function fmc15x_adc_init(). The tap delay can be from 0 to 31. What values are your tap delays set to when you capture data? Are you doing manual or auto training? Try both and see if you get better results with one or the other.
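
To put the 0 to 31 tap range in perspective, it can help to express one data bit time in taps. This is a rough sketch with assumed inputs: the 4.069 ns clock period comes from the create_clock constraint posted earlier, the data is taken to be DDR (one bit per half period), and the 75 ps tap step is the figure quoted in the BSP source comments rather than a measured value. `taps_per_ddr_bit` is a hypothetical helper.

```cpp
// One DDR bit lasts half a clock period; divide that by the delay per tap.
constexpr double taps_per_ddr_bit(double clk_period_ns, double tap_step_ps) {
    return (clk_period_ns * 1000.0 / 2.0) / tap_step_ps;
}
```

With these numbers, `taps_per_ddr_bit(4.069, 75.0)` comes out at about 27, so the full tap range spans slightly more than one bit time and a sweep over all taps should cross at least one data transition region.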

Try adjusting the following values when doing auto training (in fmc15x_adc_init):

case CONSTELLATION_ID_KC705:
            AdcPreTrainingCount = 0;
            clk_delay_training = 2;


Those determine where it will start looking for a valid window. Change the 2 to a 0 or a 10 and see if that makes a difference.

Try adjusting the following when doing manual training (in main)

    case CONSTELLATION_ID_KC705:
        printf("Found KC705 hardware\n\n");
        tapiod_clk = 0x00; tapiod_data = 0x00;
        break;

Try incrementing the clock IDELAY value in steps of 5 and see whether the data you get becomes better or worse. That manually finds the window where the data is good.
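
The manual sweep described above can be summarized in code once an error count has been recorded for each tap setting. The sketch below is not taken from the BSP; `best_tap` is a hypothetical helper that picks the middle of the longest run of error-free taps, on the assumption that the center of the widest clean window gives the most margin.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kNumTaps = 32;  // tap values 0..31

// Returns the middle tap of the longest run of error-free taps,
// or -1 if every tap showed at least one error.
int best_tap(const std::array<std::size_t, kNumTaps>& errors) {
    std::size_t best_start = 0, best_len = 0, run_len = 0;
    for (std::size_t t = 0; t < kNumTaps; ++t) {
        if (errors[t] == 0) {
            ++run_len;
            if (run_len > best_len) {
                best_len = run_len;
                best_start = t + 1 - run_len;  // first tap of the current run
            }
        } else {
            run_len = 0;                       // run broken by a bad tap
        }
    }
    if (best_len == 0) return -1;
    return static_cast<int>(best_start + (best_len - 1) / 2);
}
```

For example, if only taps 3 through 9 are error-free, the function returns tap 6. Comparing that value with what the auto training reports is one way to check whether the alignment machine lands inside the clean window.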

Regards,
Luis

tedjnsn November 07, 2014, 04:12 PM (#8)

  • Member
  • *
  • Posts: 47
Luis,


I tried tuning the delays manually by changing the taps in the fmc15xapp, but I got somewhat confused by its behavior: when I ask the app to read back the delay values from the tap value registers, it returns a different number after each run of the app. I would expect it to return the value of the manual tap setting every time. Is that right?
  • « Last Edit: November 07, 2014, 04:45 PM by tedjnsn »

tedjnsn November 07, 2014, 04:43 PM (#9)

  • Member
  • *
  • Posts: 47
I'm confused by this piece from fmc15x_adc.cpp that came with BSP:
Code: [Select]
   // Manual training
   if (auto_training == 0) {
      /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
      // IO Delay Tuning (increment clock IODelay value of +75ps {tapiod_clk} times)
      for (int32_t i = 0; i <tapiod_clk; i++) {
         rc = sipif_writesipreg(bar_adc_phy+0, 0x20);
         if(rc!=SIPIF_ERR_OK)
            return rc;
         Sleep(10);
      }


      // IO Delay Tuning (increment data (Port A and B) IODelay value of +75ps {tapiod_data} times)
      for (int32_t i = 0; i < tapiod_data; i++)  {
         rc = sipif_writesipreg(bar_adc_phy+0, 0x02);
         if(rc!=SIPIF_ERR_OK)
            return rc;
         Sleep(10);
         rc = sipif_writesipreg(bar_adc_phy+0, 0x08);
         if(rc!=SIPIF_ERR_OK)
            return rc;
         Sleep(10);
      }
   }





It looks like the code is writing the taps to the ADS62P49_PHY_COMMAND register instead of ADS62P49_PHY_INC_A, and the bit mask looks odd (?)
  • « Last Edit: November 07, 2014, 04:58 PM by tedjnsn »

lmunoz November 07, 2014, 05:43 PM (#10)

  • Member
  • *
  • Posts: 160
I don't understand that either. It looks like a software mistake that went unnoticed because we don't use manual mode anymore; we might just remove that code in later releases. Did you try adjusting the start values for the auto training:

    case CONSTELLATION_ID_FMC151_KC705:
        printf("Found KC705 hardware\n\n");
        tapiod_clk = 0x00; tapiod_data = 0x05;
        break;




Looking at the documentation in \star_lib\sip_fmc151\doc\SD224 (sip_fmc151).pdf and the code for ads62p49_phy.vhd, it seems that to manually increment the tap delay, the software should write to offset 2 (0x0012404 + 0x10 + 0x2 = 0x12416), so the code should be:

      // IO Delay Tuning (increment clock IODelay value of +75ps {tapiod_clk} times)
      for (int32_t i = 0; i <tapiod_clk; i++) {
         rc = sipif_writesipreg(bar_adc_phy+0x2, 0x4);
         if(rc!=SIPIF_ERR_OK)
            return rc;
         Sleep(10);
      }








  • « Last Edit: November 07, 2014, 05:49 PM by lmunoz »

tedjnsn November 07, 2014, 06:58 PM (#11)

  • Member
  • *
  • Posts: 47
That's the code fix I implemented:


Code: [Select]
if (auto_training == 0) {
   // Reset clock buffer and iDelays first
   rc = sipif_writesipreg(bar_adc_phy+0, 0x13); Sleep(10);
   if(rc!=SIPIF_ERR_OK)
      return rc;
   /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
   // IO Delay Tuning (increment clock IODelay value of +75ps {tapiod_clk} times)
   for (int32_t i = 0; i < tapiod_clk; i++) {
      rc = sipif_writesipreg(bar_adc_phy+2, 0x4);
      if(rc!=SIPIF_ERR_OK)
         return rc;
      Sleep(10);
   }


   // IO Delay Tuning (increment data (Port A and B) IODelay value of +75ps {tapiod_data} times)
   for (int32_t i = 0; i < tapiod_data; i++)  {
      rc = sipif_writesipreg(bar_adc_phy+2, 0x01);
      if(rc!=SIPIF_ERR_OK)
         return rc;
      Sleep(10);
      rc = sipif_writesipreg(bar_adc_phy+2, 0x02);
      if(rc!=SIPIF_ERR_OK)
         return rc;
      Sleep(10);
   }
}


The auto training machine still fails to find a good spot, but manual clock tap = 5 and data tap = 0 seem to work for the Vivado bit file.


What's the source of the optimal delay variation between implementations? It seems that they would all have to use the same locations for the I/O primitives on the input side. Does it come from differences in BUFG placement? How can I keep this problem from reappearing after code modifications, perhaps by locking the BUFG in place?

lmunoz November 07, 2014, 08:25 PM (#12)

  • Member
  • *
  • Posts: 160
Yes, mostly the clocking buffer placement, plus trace lengths if you are running on different boards. We use auto training because we want the firmware to work with any carrier board. Locking might work.

tedjnsn November 08, 2014, 01:21 PM (#13)

  • Member
  • *
  • Posts: 47
Does proper DAC I/O timing need to be tested as well when upgrading to Vivado or modifying parts of the design, or does it just work? For example, I would like to double the output waveform streaming rate compared to the reference design, to 245.76 MHz, and reduce the DAC interpolation from x4 to x2.
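
The arithmetic behind that change can be checked in a few lines. This is only a worked example: the 122.88 MHz reference rate is inferred from "double ... to 245.76 MHz", and `DacConfig` and `dac_output_khz` are hypothetical names.

```cpp
#include <cstdint>

// Streaming (interface) rate in kHz and DAC interpolation factor.
struct DacConfig {
    std::uint32_t interface_khz;
    std::uint32_t interpolation;
};

// DAC output sample rate = interface rate x interpolation factor.
constexpr std::uint32_t dac_output_khz(DacConfig c) {
    return c.interface_khz * c.interpolation;
}

constexpr DacConfig kReference{122880, 4};  // inferred: half of 245.76 MHz, x4
constexpr DacConfig kProposed{245760, 2};   // 245.76 MHz, x2

static_assert(dac_output_khz(kReference) == dac_output_khz(kProposed),
              "both configurations give the same DAC output rate");
```

Both configurations give 491.52 MSPS at the DAC output, so only the streaming rate into the DAC doubles, and that interface is where the setup and hold margins shrink.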

lmunoz November 12, 2014, 01:06 PM (#14)

  • Member
  • *
  • Posts: 160
The software does a pattern check and it should pass if you don't have setup and hold violations. It has always just worked for me with DACs.

For an ADC, changing the clock frequency has almost always required me to find different IDELAY settings.