Intel Optimization Of Pxa27x

Optimization Techniques for PXA27x

White Paper

Optimization Technology for the Intel® PXA27x Processor Family

Performance and Power Savings for Wireless System and Application Development

Table of Contents

1. Introduction
2. Optimizing the Intel® PXA27x Processor's Performance via the BSP
   2.1 Optimizing for On-chip SRAM
   2.2 Optimizing for the Enhanced Memory Subsystem
   2.3 Optimizing for the Bus Transaction Arbiter
3. Intel® Wireless MMX™ Technology
   3.1 Enabling Intel® Wireless MMX™ Technology
   3.2 Writing Intel® Wireless MMX™ Technology Code
4. Intel® Quick Capture Technology
5. Wireless Intel SpeedStep® Technology
6. Intel® Integrated Performance Primitives (Intel® IPP)
7. Intel® Software Development Tools
   7.1 Compilers
   7.2 Debuggers
8. Intel® VTune™ Performance Analyzer
9. Intel® PCA Developer Network
10. Summary
Appendix A. Overview of Key System Optimizations

1. Introduction

The Intel® Personal Internet Client Architecture (Intel® PCA) PXA27x processor family offers developers a new generation of ultra-low power and industry-leading multimedia performance on silicon. Intel has integrated a host of new features in the Intel PXA27x processor family to enable this level of power and performance, including:

■ Intel® Wireless MMX™ technology
■ Wireless Intel SpeedStep® technology
■ Intel® Quick Capture technology
■ Up to 624MHz core speed
■ Enhanced memory subsystem
■ Intelligent bus transaction arbiter
■ 256K of on-chip SRAM

Intel's complete suite of development components is designed to help customers take full advantage of cutting-edge technologies and get the best power and performance from mobile devices. These technologies also allow Independent Software Vendors (ISVs) to fully tune their applications.

This paper describes a typical development cycle and how to take advantage of the optimization technologies available from Intel.
The key features this paper addresses are:

■ Operating System Board Support Packages (BSPs): Intel provides BSPs for a variety of operating systems including Linux*, Microsoft Windows* Mobile for PocketPCs and Smartphones, Microsoft Windows* CE .NET, Palm* OS, and Symbian* OS. The BSPs include the latest optimizations and drivers for the Intel PXA27x processor family and make it easy for customers to create a BSP customized for their own mobile device.
■ Intel® Wireless MMX™ Technology: an advanced set of multimedia instructions that brings desktop-like multimedia performance to Intel PXA27x processor-based clients, while minimizing the power needed to run rich applications.
■ Wireless Intel SpeedStep® Technology: allows customers to dynamically adjust the power and performance of the processor based on CPU demand. This can significantly decrease power consumption in wireless handheld devices.
■ Intel® Quick Capture Technology: provides the ability to get live video and high-quality still images from a wide range of camera sensors in current and future camera-enabled mobile handsets and PDAs.
■ Intel® Integrated Performance Primitives (Intel® IPP): a cross-platform software library that allows users to write optimized applications that utilize Intel Wireless MMX technology to maximize performance on the Intel PXA27x processor.
■ Intel® Software Development Tools (Intel® SDT): provide both an optimizing compiler and a set of sophisticated, high-level language debuggers to help software run at top speed.
■ Intel® VTune™ Analyzer: this tool lets users profile applications for hotspots of activity. A tuning assistant provides support to optimize C/C++ code and/or assembler sequences.
■ Intel® PCA Developer Network: provides information on third-party software applications that are already optimized, as well as optimization labs and support to answer questions.
With over 1,000 companies and over 3,000 different software and hardware solutions, the Intel PCA Developer Network can help customers find value-add solutions for mobile devices.

This paper introduces Intel® optimization technologies and addresses how each fits into a typical development cycle consisting of iterations of coding, optimizing, and profiling. Devices that take advantage of these optimizations achieve significant performance improvements and power savings over those that do not. Applications that take advantage of these optimizations will run faster and more efficiently on Intel PXA27x processor-based devices. Pointers to additional resources that provide more detailed information on each technology are provided at the end of this paper.

2. Optimizing the Intel® PXA27x Processor's Performance via the BSP

To ensure that designs and applications take full advantage of the technology in the Intel PXA27x processor, OEMs and ODMs should make sure that the device BSP supports these features. Intel provides BSPs for a variety of operating systems including Linux, Microsoft Windows Mobile for PocketPCs and Smartphones, Microsoft Windows CE .NET, Palm OS, and Symbian OS. The BSPs include the latest optimizations and drivers for the Intel PXA27x processor and make it easy for customers to create a customized BSP.

The BSPs for the Intel PXA27x processor contain an extensive number of optimizations, the latest versions of which can be obtained from Intel field sales representatives.
While a full list of the available optimized drivers is beyond the scope of this paper, several key optimizations for the Intel PXA27x processor are described here, including:

■ Optimizing for 256K of on-chip SRAM
■ Enabling and utilizing the enhanced memory subsystem
■ Taking advantage of the bus-transaction arbiter

2.1 Optimizing for On-chip SRAM

The internal SRAM can be used for frame buffers as well as storage of variables or data to be processed. The SRAM has a fast access time and is powered from the VCC_SRAM domain, offering both lower power and higher performance than external memory.

Example: a 320x240 frame buffer at 16 bits per pixel consumes about 154K of memory (153,600 bytes), allowing the rest of the 256K to be used for temporary storage of MPEG-4* video buffers, a Java* virtual machine heap, incoming data from the Intel® Quick Capture camera interface, executable code, streamed data, or other variables that need to be accessed quickly.

The SRAM consists of four independently controllable 64K banks. When entering sleep or deep-sleep mode, one or more banks can remain powered on. This retains the OS state so context can be restored quickly upon wakeup from those modes.

2.2 Optimizing for the Enhanced Memory Subsystem

The Intel PXA27x processor family enhances and adds flexibility to the bus settings of the Intel® PXA255 processor family, which supported a 200-MHz system bus at a core speed of 400MHz. With fast bus mode enabled by setting the CLKCFG[B] bit to one, the internal system bus in the Intel PXA27x processor can run at up to 208MHz at many other product points, including 312 and 208MHz. As a result, applications on the Intel PXA27x processor can offer better performance at the lower frequency settings.

The memory controller can also run at greater speeds than before when the CCCR[A] bit is set to one. This reduces latency and increases bandwidth to memory, offering better system performance.
2.3 Optimizing for the Bus Transaction Arbiter

The bus arbiter in the Intel PXA27x processor performs the arbitration for internal-bus-access transactions and is programmable through the ARB_CNTRL register. The Intel PXA27x processor system bus supports six clients: the core, the DMA controller, the LCD controller, the USB host controller, the internal memory controller, and the external memory controller. Customers can program priority weights for each of these clients via the arbiter-control register, which enables fine-tuning of device performance based on the typical usage model for that device.

Example: a device is designed to encode MPEG-4 video using the Intel Quick Capture interface in the Intel PXA27x processor and stream it over USB Host to an attached USB Client. Assigning higher priority weights to the core and the USB host controller improves the performance of processing the MPEG-4 video stream (CPU intensive) and transmitting it via USB Host. Customers are encouraged to test the performance benefits of applying different priority weights to these clients based on the supported usage model.

Further performance can be gained by speculatively "parking" a specific client on the arbiter. This means that the arbiter will always start with that client when internal-bus-access transactions are performed. Setting the arbiter-control register to park on the core often results in the best performance.

This is only an overview of the key features that should be in your BSP or in your application. Other features are listed in the other sections of this paper. For more information, consult the following documentation:

■ The Intel® PXA27x Processor Optimization Guide
■ The Intel® PXA27x Processor Developer Manual Volume I of III
■ The Intel® PXA27x Processor Developer Manual Volume II of III
■ The Intel® PXA27x Processor Developer Manual Volume III of III

3. Intel® Wireless MMX™ Technology

Introduced in 2003, Intel Wireless MMX technology is an advanced set of multimedia instructions that help bring desktop-like multimedia performance to Intel PXA27x processor-based clients, while minimizing the power needed to run rich applications.