.. _debug-a-dpcpp-application-on-a-cpu:

Debug a SYCL\* Application on a CPU
===================================

This section describes a basic scenario of debugging a sample SYCL\* application, Array Transform, with the kernel offloaded to the CPU.

Before you proceed, make sure you have completed all necessary setup steps and obtained the sample code as described in the `Get Started Guide `__.

.. _basic_debugging_cpu:

.. rubric:: Basic Debugging
   :class: sectiontitle

.. note::
   For your convenience, all common Intel® Distribution for GDB\* commands used in the examples below are provided in the `reference sheet `__.

If you have not already done so, start the debugger:

.. code-block:: bash

   gdb-oneapi array-transform

Make sure that the kernel is offloaded to the right device:

.. code-block:: bash

   run cpu

Example output:

.. code-block:: bash

   [SYCL] Using device: [Intel® Core™ i7-9750H CPU @ 2.60GHz] from [Intel® OpenCL]

Consider the Array Transform sample, which contains a simple kernel (a function that can be offloaded to different devices):

.. code-block:: bash

   54       h.parallel_for(data_range, [=](id<1> index) {
   55         size_t id0 = GetDim(index, 0);
   56         int element = in[index]; // breakpoint-here
   57         int result = element + 50;
   58         if (id0 % 2 == 0) {
   59           result = result + 50; // then-branch
   60         } else {
   61           result = -1; // else-branch
   62         }
   63         out[index] = result;
   64       });

The kernel processes each element of the input array differently depending on whether its index is even or odd, and writes the result to an output array.

Define a breakpoint at line 56:

.. code-block:: bash

   break 56

Expected output:

.. code-block:: bash

   Breakpoint 1 at 0x405800: file /path/to/array-transform.cpp, line 56.

.. note::
   Do not expect your output to match the output shown in this tutorial exactly. Thread IDs, addresses, and the order of events can vary because of the nature of parallelism and differences between machines. The ellipsis *[...]* denotes output omitted for brevity.

Run the program:

.. code-block:: bash

   run cpu

When a thread hits the breakpoint, you should see the following output:

.. code-block::

   Starting program: cpu
   [Thread debugging using libthread_db enabled]
   Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
   [New Thread 0x7ffff37dc700 (LWP 21540)]
   [New Thread 0x7fffdba79700 (LWP 21605)]
   [New Thread 0x7fffdb678700 (LWP 21606)]
   [New Thread 0x7fffdb277700 (LWP 21607)]
   [SYCL] Using device: [Intel® Core™ i7-7567U CPU @ 3.50GHz] from [Intel® OpenCL]
   [Switching to Thread 0x7fffdb678700 (LWP 21606)]

   Thread 4 "array-transform" hit Breakpoint 1, main::$_1::operator()[...] at array-transform.cpp:56
   56         int element = in[index]; // breakpoint-here

Now you can issue the usual Intel® Distribution for GDB\* commands to inspect the local variables, print a stack trace, and get information on threads.

Keep debugging and display the value of the ``index`` variable:

.. code-block:: bash

   print index

Expected output:

.. code-block:: bash

   $1 = cl::sycl::id<1> = {24}

Continue program execution:

.. code-block:: bash

   continue

You should see the next breakpoint hit event, which comes from another thread:

.. code-block:: bash

   Continuing.
   [Switching to Thread 0x7fffdba79700 (LWP 21605)]

   Thread 3 "array-transform" hit Breakpoint 1, main::$_1::operator()[...] at array-transform.cpp:56
   56         int element = in[index]; // breakpoint-here

Print the value of the ``index`` variable again:

.. code-block:: bash

   print index

The output differs from the previous one because a different thread, processing a different element, is now stopped at the breakpoint:

.. code-block:: bash

   $2 = cl::sycl::id<1> = {32}

To print data elements, use the bracket operator of the accessor:

.. code-block:: bash

   print in[index]

Expected output:

.. code-block:: bash

   $3 = 132

You can also print the accessor contents in the following ways:

-  .. container::
      :name: LI_897872B372AA4FA0ADE93EC40879F351

      ::

         print in

      Expected output:

      ::

         $4 = {[...], MData = 0x7fffffffd3e0}

-  .. container::
      :name: LI_370178EB8B34456F875706661BED1991

      ::

         x /4dw in.MData

      Expected output:

      ::

         0x7fffffffde30:    100    101    102    103

      Here, the ``x`` command examines the memory contents at the given address, and ``/4dw`` specifies that the output must contain four items in decimal format, each one word long.

.. rubric:: Single Stepping
   :class: sectiontitle

A common debugging activity is single-stepping through the source. The ``step`` and ``next`` commands both advance by one source line: ``step`` steps into function calls, whereas ``next`` steps over them.

To check which thread is current, run the following command:

.. code-block:: bash

   thread

You should get the following output:

.. code-block:: bash

   [Current thread is 3 (Thread 0x7fffdba79700 (LWP 21605))]

To check the data of a particular thread, run:

.. code-block:: bash

   info thread 3

Example output:

::

     Id   Target Id      Frame
   * 3    Thread [...]   main::$_1::operator()[...] at array-transform.cpp:56

To make Thread 3 move forward by one source line, run:

.. code-block:: bash

   next

You should see the following output:

.. code-block:: bash

   [Switching to Thread 0x7fffdb277700 (LWP 21607)]

   Thread 5 "array-transform" hit Breakpoint 1, main::$_1::operator()[...] at array-transform.cpp:56
   56         int element = in[index]; // breakpoint-here

The step did not take place. Instead, the debugger received a breakpoint event from Thread 5 and switched the context to that thread. This happens because you are debugging a multi-threaded program, in which events can arrive from different threads while the current thread is stepping. This is the default behavior, but you can change it to make debugging more efficient.

To ensure that the current thread executes a single line without interference, set ``scheduler-locking`` to ``step`` or ``on``. With ``step``, the other threads stay stopped only while the current thread is stepping. With ``on``, they also stay stopped when the current thread is resumed:

.. code-block:: bash

   set scheduler-locking step

.. note::
   The default value of ``scheduler-locking`` is ``replay``. If you set it to ``on`` and later want to resume the whole program with the ``continue`` command, remember to set ``scheduler-locking`` back to ``replay`` or ``off``. Otherwise, only the current thread is resumed. For this reason, the recommended value for ``scheduler-locking`` is ``step``.

Run the ``next`` command again:

.. code-block:: bash

   next

You should see the following output:

.. code-block:: bash

   57         int result = element + 50;

Run the ``next`` command once more:

.. code-block:: bash

   next

You should see the following output:

.. code-block:: bash

   58         if (id0 % 2 == 0) {

To see the value of the ``index`` variable, run:

.. code-block:: bash

   print index

You should see the following output:

.. code-block:: bash

   $6 = cl::sycl::id<1> = {16}

Run:

.. code-block:: bash

   print in[index]

The expected output looks as follows:

.. code-block:: bash

   $7 = 116

Finally, run:

.. code-block:: bash

   print result

You should see the following output:

.. code-block:: bash

   $8 = 166
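
When you inspect ``result`` in the debugger, it can help to know in advance what value to expect for a given ``index``. The following host-only sketch mirrors the arithmetic of the kernel body in plain C++, without SYCL\*. It assumes that the input array holds ``100 + i`` at position ``i``, which is consistent with the debugger output shown above (``in[16]`` is 116, ``in[32]`` is 132) but is not confirmed by this section. The names ``kInputBase``, ``InputElement``, ``ResultAtLine58``, and ``ExpectedOutput`` are hypothetical helpers and are not part of the sample.

```cpp
#include <cstddef>

// Assumption: the sample initializes the input so that in[i] == 100 + i.
// This matches the values printed in the debugging session, but the
// initialization code itself is not shown in this section.
constexpr int kInputBase = 100;  // hypothetical name

// Value the accessor is assumed to return for `in[index]`.
constexpr int InputElement(std::size_t index) {
  return kInputBase + static_cast<int>(index);
}

// Value of `result` when a thread is stopped at line 58,
// that is, after line 57 has run and before the branch is taken.
constexpr int ResultAtLine58(std::size_t index) {
  return InputElement(index) + 50;
}

// Value finally written to `out[index]` at line 63.
constexpr int ExpectedOutput(std::size_t index) {
  int result = InputElement(index) + 50;
  if (index % 2 == 0) {
    result = result + 50;  // then-branch: even index
  } else {
    result = -1;           // else-branch: odd index
  }
  return result;
}

// Compile-time checks against the values seen in the session above.
static_assert(InputElement(16) == 116, "matches `print in[index]` for index 16");
static_assert(ResultAtLine58(16) == 166, "matches `print result` at line 58");
static_assert(ExpectedOutput(17) == -1, "odd indices take the else-branch");
```

If the value that the debugger prints for ``result`` at line 58 differs from ``ResultAtLine58(index)``, either the input assumption does not hold for your build of the sample, or the thread you are inspecting is not the one you expect. In the latter case, check the current thread with ``thread`` before drawing conclusions.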