AMD Announces World's Fastest HPC Accelerator for Scientific Research

– AMD Instinct™ MI100 accelerators revolutionize high-performance computing (HPC) and artificial intelligence (AI) with industry-leading compute performance –

– First GPU accelerator with the new “AMD CDNA” architecture, engineered for the exascale era –

Bangkok, Thailand – 18 November 2020 – AMD (NASDAQ: AMD) today announced the new AMD Instinct™ MI100 accelerator, the world's fastest GPU for high-performance computing and the first x86 server GPU to surpass the 10 teraflops (FP64) performance barrier.¹ Supported by Dell, Gigabyte, HPE and Supermicro, the AMD Instinct™ MI100 combines with AMD EPYC™ processors and the ROCm™ 4.0 open software platform, and is designed to propel new discoveries in the exascale era.

Built on the new AMD CDNA architecture, the AMD Instinct™ MI100 accelerator raises the level of accelerated systems for HPC and AI when paired with 2nd Gen AMD EPYC processors. The AMD Instinct™ MI100 offers up to 11.5 TFLOPS of peak FP64 performance for HPC and up to 46.1 TFLOPS of peak FP32 Matrix performance for AI and machine learning workloads.² With new AMD Matrix Core technology, the AMD Instinct™ MI100 also delivers a nearly 7x increase in FP16 theoretical peak floating-point performance for AI training workloads compared with AMD's prior-generation accelerators.³

Brad McCredie, vice president, Data Center GPU and Accelerated Processing, AMD, said: “Today AMD takes a major step forward in exascale computing as we unveil the AMD Instinct MI100, the world's fastest GPU for high-performance computing. Squarely targeted at scientific computing workloads, our latest accelerator, combined with the AMD ROCm open software platform, is designed to give scientists and researchers a superior foundation for their work.”

Open Software Platform for the Exascale Era

The AMD ROCm developer software provides the foundation for exascale computing. It is an open source toolset consisting of compilers, programming APIs and libraries, and exascale software developers use it to create high-performance applications. ROCm 4.0 has been optimized to deliver performance at scale for MI100-based systems. ROCm 4.0 upgrades the compiler to be open source and unified, supporting both OpenMP® 5.0 and HIP. The PyTorch and TensorFlow frameworks, which have been optimized for ROCm 4.0, can now achieve higher performance with the AMD Instinct MI100.⁷,⁸ ROCm 4.0 is the latest release for HPC, ML and AI application developers, allowing them to create performance-portable software.

Bronson Messer, director of science, Oak Ridge Leadership Computing Facility, said: “We've received early access to the AMD Instinct MI100 accelerator, and the preliminary results are very encouraging. We've seen significant performance boosts, up to 2-3x compared to other GPUs. What's also important to recognize is the impact software has on performance. The fact that the AMD ROCm open software platform and the HIP developer tool are open source and work on a variety of platforms is something we have been almost obsessed with since we fielded the very first hybrid CPU/GPU system.”

Key capabilities and features of the AMD Instinct MI100 accelerator include:

New AMD CDNA architecture – Engineered to power AMD GPUs for the exascale era and at the heart of the AMD Instinct MI100 accelerator, the AMD CDNA architecture offers exceptional performance and power efficiency.
Leading FP64 and FP32 performance for HPC workloads – Delivers industry-leading FP64 performance of up to 11.5 TFLOPS and FP32 performance of up to 23.1 TFLOPS, helping scientists and researchers around the world accelerate discoveries in fields such as life sciences, energy, finance, academia, government, defense and more.¹
New Matrix Core technology for HPC and AI – Strong performance for a full range of single and mixed precision matrix operations, such as FP32, FP16, bFloat16, Int8 and Int4, engineered to boost the convergence of HPC and AI.
2nd Gen AMD Infinity Fabric technology – The AMD Instinct MI100 provides up to 2x the peer-to-peer (P2P) peak I/O bandwidth of PCIe® 4.0, with up to 340 GB/s of aggregate bandwidth per card using AMD Infinity Fabric™ Links.⁴ Within a single server, MI100 GPUs can be configured in up to two fully connected quad-GPU hives, each providing up to 552 GB/s of P2P I/O bandwidth for fast data sharing.⁴
Ultra-fast HBM2 memory – Features 32GB of high-bandwidth HBM2 memory at a clock rate of 1.2 GHz and delivers 1.23 TB/s of memory bandwidth to support large data sets and help eliminate bottlenecks in moving data in and out of memory.⁵
Support for the latest PCIe® Gen 4.0 – Designed with the latest PCIe Gen 4.0 technology, providing up to 64 GB/s of peak theoretical data transport bandwidth from CPU to GPU.⁶
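
The bandwidth figures in the list above can be reproduced from the per-link and per-clock numbers given in the footnotes of this release. The Python sketch below is an illustration only, not AMD's published methodology. The 92 GB/s link rate, three links per GPU, 64 GB/s PCIe Gen4 rate and 1,200 MHz memory clock come from the footnotes. The 4096-bit HBM2 interface width, the two-transfers-per-clock factor, and the rule that each link in a hive is shared by two GPUs are assumptions added here.

```python
# Figures quoted in the footnotes of this release.
LINK_GBPS = 92        # peak rate of one Infinity Fabric link, GB/s
LINKS_PER_GPU = 3     # Infinity Fabric links on each MI100
PCIE4_GBPS = 64       # peak CPU-to-GPU rate over PCIe Gen4, GB/s
MEM_CLOCK_GHZ = 1.2   # peak HBM2 memory clock (1,200 MHz)

# Assumptions made for this sketch (not stated in the release).
HBM2_BUS_BITS = 4096  # assumed total HBM2 interface width
DDR_FACTOR = 2        # assumed two transfers per memory clock


def p2p_per_gpu():
    """Peak peer-to-peer bandwidth of one GPU over its Infinity Fabric links."""
    return LINK_GBPS * LINKS_PER_GPU


def aggregate_io_per_gpu():
    """Peer-to-peer bandwidth plus the PCIe Gen4 host link."""
    return p2p_per_gpu() + PCIE4_GBPS


def hive_p2p(gpus=4):
    """Total link bandwidth in a fully connected hive.

    Each link joins two GPUs, so it is counted once rather than twice.
    """
    return gpus * p2p_per_gpu() // 2


def memory_bandwidth_gbps():
    """Peak memory bandwidth in GB/s: clock * transfers per clock * bytes per transfer."""
    return MEM_CLOCK_GHZ * DDR_FACTOR * HBM2_BUS_BITS / 8


if __name__ == "__main__":
    print("P2P per GPU (GB/s):", p2p_per_gpu())
    print("Aggregate I/O per GPU (GB/s):", aggregate_io_per_gpu())
    print("Quad-GPU hive P2P (GB/s):", hive_p2p())
    print("Two hives per server (GB/s):", 2 * hive_p2p())
    print("Memory bandwidth (GB/s):", memory_bandwidth_gbps())
```

Under these assumptions the functions give 276 GB/s, 340 GB/s and 552 GB/s for the interconnect figures, and about 1,229 GB/s (1.23 TB/s) for memory bandwidth, in line with the values quoted in this release.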

Available Server Solutions

AMD Instinct MI100 accelerators are expected to be available by the end of 2020 in systems from major OEM and ODM partners in the enterprise market, including:

Dell

Ravi Pendekanti, senior vice president, PowerEdge Servers, Dell Technologies, said: “Dell EMC PowerEdge servers will support the new AMD Instinct MI100, which will enable faster insights from data. This will help our customers achieve stronger and markedly more efficient HPC and AI results. AMD has been a valued partner in our support for advancing innovation in the data center. The high-performance capabilities of AMD Instinct accelerators are a natural fit for HPC and AI workloads on our PowerEdge servers.”

Gigabyte

Alan Chen, assistant vice president, NCBU, GIGABYTE, said: “We're pleased to work with AMD again as a strategic partner, offering customers server hardware for high-performance computing. AMD Instinct MI100 accelerators represent the next level of high-performance computing in the data center, bringing greater connectivity and data bandwidth for energy research, molecular dynamics and deep learning training. With the AMD Instinct MI100 in GIGABYTE servers, our customers can benefit from improved performance across a range of scientific and industrial HPC workloads.”

Hewlett Packard Enterprise (HPE)

Bill Mannel, vice president and general manager, HPC, at HPE, said: “Customers use HPE Apollo systems for purpose-built capabilities and the performance to tackle a range of complex, data-intensive workloads across high-performance computing (HPC), deep learning and analytics. With the introduction of the new HPE Apollo 6500 Gen10 Plus system, we are further advancing our portfolio to improve workload performance by supporting the AMD Instinct MI100 accelerator, which enables greater connectivity and data processing, alongside the 2nd Gen AMD EPYC™ processor. We look forward to continuing our collaboration with AMD to expand our offerings with its latest processors and accelerators.”

Supermicro

Vik Malyala, senior vice president, Field Application Engineering and Business Development, Supermicro, said: “We're excited to work with AMD on the AMD Instinct MI100 accelerator, which delivers impressive performance for high-performance computing. With the combination of compute power from the new AMD CDNA architecture, high-performance memory and the GPU peer-to-peer bandwidth the AMD Instinct MI100 brings, our customers will get access to great solutions that meet their accelerated compute requirements and critical enterprise workloads. The AMD Instinct MI100 will be a great addition to our multi-GPU servers and to our portfolio of high-performance systems and server building block solutions.”

MI100 Specifications

Compute Units: 120
Stream Processors: 7680
FP64 TFLOPS (Peak): Up to 11.5
FP32 TFLOPS (Peak): Up to 23.1
FP32 Matrix TFLOPS (Peak): Up to 46.1
FP16/FP16 Matrix TFLOPS (Peak): Up to 184.6
INT4 | INT8 TOPS (Peak): Up to 184.6
bFloat16 TFLOPS (Peak): Up to 92.3
HBM2 ECC Memory: 32GB
Memory Bandwidth: Up to 1.23 TB/s
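
The peak compute figures in the specification table are consistent with a simple throughput model. The Python sketch below is a rough check, not AMD's published methodology. It uses the 7680 stream processors from the table and the 1,502 MHz peak engine clock and MI50 FP16 figure from the footnotes. It assumes one fused multiply-add (two floating-point operations) per stream processor per clock for FP32, and the per-format multipliers in `RATE_VS_FP32` are assumptions chosen to match the published values.

```python
STREAM_PROCESSORS = 7680   # from the specification table
PEAK_CLOCK_GHZ = 1.502     # peak engine clock quoted in the footnotes
FLOPS_PER_FMA = 2          # assumption: one fused multiply-add per clock

# Assumed throughput of each format relative to vector FP32.
RATE_VS_FP32 = {
    "FP64": 0.5,
    "FP32": 1,
    "FP32 Matrix": 2,
    "bFloat16": 4,
    "FP16 Matrix": 8,
}

MI50_FP16_TFLOPS = 26.5    # prior-generation figure quoted in the footnotes


def peak_tflops(fmt):
    """Theoretical peak throughput in TFLOPS for the given number format."""
    gflops = STREAM_PROCESSORS * FLOPS_PER_FMA * PEAK_CLOCK_GHZ * RATE_VS_FP32[fmt]
    return gflops / 1000


def fp16_speedup_over_mi50():
    """Ratio behind the 'nearly 7x' FP16 claim."""
    return peak_tflops("FP16 Matrix") / MI50_FP16_TFLOPS


if __name__ == "__main__":
    for fmt in RATE_VS_FP32:
        print(f"{fmt}: up to {peak_tflops(fmt):.1f} TFLOPS")
    print(f"FP16 vs MI50: {fp16_speedup_over_mi50():.2f}x")
```

Under these assumptions the model gives 11.5, 23.1, 46.1, 92.3 and 184.6 TFLOPS when rounded to one decimal place, and an FP16 ratio just under 7 relative to the MI50.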

Supporting Resources

Learn more about AMD Instinct™ Accelerators
Learn more about AMD HPC Solutions
AMD HPC Solutions Hub
Learn more about AMD CDNA
Learn more about the AMD 2nd Gen EPYC™ Processor
Become a fan of AMD on Facebook
Follow AMD on Twitter

About AMD

For more than 50 years AMD has driven innovation in high-performance computing, graphics and visualization technologies, the building blocks for gaming, professional platforms and the data center. Hundreds of millions of consumers, leading Fortune 500 businesses and cutting-edge scientific research facilities around the world rely on AMD technology to improve how they live, work and play. AMD employees around the world are focused on building new products that push the boundaries of what is possible. For more information about how AMD is enabling today and inspiring tomorrow, visit the AMD (NASDAQ: AMD) website, blog, Facebook and Twitter pages.

CAUTIONARY STATEMENT
This press release contains forward-looking statements concerning Advanced Micro Devices, Inc. (AMD) such as the features, functionality, performance, availability, timing and expected benefits of AMD products including the AMD Instinct™ MI100 accelerator, which are made pursuant to the Safe Harbor provisions of the Private Securities Litigation Reform Act of 1995. Forward looking statements are commonly identified by words such as “would,” “may,” “expects,” “believes,” “plans,” “intends,” “projects” and other terms with similar meaning. Investors are cautioned that the forward-looking statements in this press release are based on current beliefs, assumptions and expectations, speak only as of the date of this press release and involve risks and uncertainties that could cause actual results to differ materially from current expectations. Such statements are subject to certain known and unknown risks and uncertainties, many of which are difficult to predict and generally beyond AMD’s control, that could cause actual results and other future events to differ materially from those expressed in, or implied or projected by, the forward-looking information and statements. 
Material factors that could cause actual results to differ materially from current expectations include, without limitation, the following: Intel Corporation’s dominance of the microprocessor market and its aggressive business practices; the ability of third party manufacturers to manufacture AMD’s products on a timely basis in sufficient quantities and using competitive technologies; expected manufacturing yields for AMD’s products; the availability of essential equipment, materials or manufacturing processes; AMD’s ability to introduce products on a timely basis with features and performance levels that provide value to its customers; global economic uncertainty; the loss of a significant customer; AMD’s ability to generate revenue from its semi-custom SoC products; the impact of the COVID-19 pandemic on AMD’s business, financial condition and results of operations; political, legal, economic risks and natural disasters; the impact of government actions and regulations such as export administration regulations, tariffs and trade protection measures; the impact of acquisitions, joint ventures and/or investments on AMD’s business, including the announced acquisition of Xilinx, and the failure to integrate acquired businesses; AMD’s ability to complete the Xilinx merger; the impact of the announcement and pendency of the Xilinx merger on AMD’s business; potential security vulnerabilities; potential IT outages, data loss, data breaches and cyber-attacks; uncertainties involving the ordering and shipment of AMD’s products; quarterly and seasonal sales patterns; the restrictions imposed by agreements governing AMD’s notes and the revolving credit facility; the competitive markets in which AMD’s products are sold; market conditions of the industries in which AMD products are sold; AMD’s reliance on third-party intellectual property to design and introduce new products in a timely manner; AMD’s reliance on third-party companies for the design, manufacture and supply of motherboards, software and other computer platform components; AMD’s reliance on Microsoft Corporation and other software vendors’ support to design and develop software to run on AMD’s products; AMD’s reliance on third-party distributors and add-in-board partners; the potential dilutive effect if the 2.125% Convertible Senior Notes due 2026 are converted; future impairments of goodwill and technology license purchases; AMD’s ability to attract and retain qualified personnel; AMD’s ability to generate sufficient revenue and operating cash flow or obtain external financing for research and development or other strategic investments; AMD’s indebtedness; AMD’s ability to generate sufficient cash to service its debt obligations or meet its working capital requirements; AMD’s ability to repurchase its outstanding debt in the event of a change of control; the cyclical nature of the semiconductor industry; the impact of modification or interruption of AMD’s internal business processes and information systems; compatibility of AMD’s products with some or all industry-standard software and hardware; costs related to defective products; the efficiency of AMD’s supply chain; AMD’s ability to rely on third party supply-chain logistics functions; AMD’s stock price volatility; worldwide political conditions; unfavorable currency exchange rate fluctuations; AMD’s ability to effectively control the sales of its products on the gray market; AMD’s ability to adequately protect its technology or other intellectual property; current and future claims and litigation; potential tax liabilities; and the impact of environmental laws, conflict minerals-related provisions and other laws or regulations. Investors are urged to review in detail the risks and uncertainties in AMD’s Securities and Exchange Commission filings, including but not limited to AMD’s Quarterly Report on Form 10-Q for the quarter ended September 26, 2020.

©2020 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, AMD Instinct, Infinity Fabric, ROCm and combinations thereof are trademarks of Advanced Micro Devices, Inc. The OpenMP name and the OpenMP logos are registered trademarks of the OpenMP Architecture Review Board. PCIe is a registered trademark of PCI-SIG Corporation. Python is a trademark of the Python Software Foundation. PyTorch is a trademark or registered trademark of PyTorch. TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

1. Calculations conducted by AMD Performance Labs as of Sep 18, 2020 for the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak boost engine clock resulted in 11.54 TFLOPS peak double precision (FP64), 46.1 TFLOPS peak single precision matrix (FP32), 23.1 TFLOPS peak single precision (FP32), 184.6 TFLOPS peak half precision (FP16) peak theoretical, floating-point performance. Published results on the Nvidia Ampere A100 (40GB) GPU accelerator resulted in 9.7 TFLOPS peak double precision (FP64), 19.5 TFLOPS peak single precision (FP32), 78 TFLOPS peak half precision (FP16) theoretical, floating-point performance. Server manufacturers may vary configuration offerings yielding different results. MI100-03
2. Calculations performed by AMD Performance Labs as of Sep 3, 2020 on the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak engine clock resulted in 46.1 TFLOPS peak theoretical single precision (FP32 Matrix) Math floating-point performance. The Nvidia Ampere A100 (40GB) GPU accelerator published results are 19.5 TFLOPS peak single precision (FP32) floating-point performance. Nvidia results found at: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf. Server manufacturers may vary configuration offerings yielding different results. MI100-01
3. Calculations performed by AMD Performance Labs as of Sep 18, 2020 for the AMD Instinct™ MI100 accelerator at 1,502 MHz peak boost engine clock resulted in 184.57 TFLOPS peak theoretical half precision (FP16) and 46.14 TFLOPS peak theoretical single precision (FP32 Matrix) floating-point performance. The results calculated for Radeon Instinct™ MI50 GPU at 1,725 MHz peak engine clock resulted in 26.5 TFLOPS peak theoretical half precision (FP16) and 13.25 TFLOPS peak theoretical single precision (FP32 Matrix) floating-point performance. Server manufacturers may vary configuration offerings yielding different results. MI100-04
4. Calculations as of Sep 18, 2020. AMD Instinct™ MI100 built on AMD CDNA technology accelerators supporting PCIe® Gen4 providing up to 64 GB/s peak theoretical transport data bandwidth from CPU to GPU per card. AMD Instinct™ MI100 accelerators include three Infinity Fabric™ links providing up to 276 GB/s peak theoretical GPU to GPU or Peer-to-Peer (P2P) transport rate bandwidth performance per GPU card. Combined with PCIe Gen4 support providing an aggregate GPU card I/O peak bandwidth of up to 340 GB/s. MI100s have three links: 92 GB/s * 3 links per GPU = 276 GB/s. Four GPU hives provide up to 552 GB/s peak theoretical P2P performance. Dual 4 GPU hives in a server provide up to 1.1 TB/s total peak theoretical direct P2P performance per server. AMD Infinity Fabric link technology not enabled: Four GPU hives provide up to 256 GB/s peak theoretical P2P performance with PCIe® 4.0. Server manufacturers may vary configuration offerings yielding different results. MI100-07
5. Calculations by AMD Performance Labs as of Oct 5, 2020 for the AMD Instinct™ MI100 accelerator designed with AMD CDNA 7nm FinFET process technology at 1,200 MHz peak memory clock resulted in 1.2288 TB/s peak theoretical memory bandwidth performance. The results calculated for Radeon Instinct™ MI50 GPU designed with “Vega” 7nm FinFET process technology with 1,000 MHz peak memory clock resulted in 1.024 TB/s peak theoretical memory bandwidth performance. CDNA-04
6. Works with PCIe® Gen 4.0 and Gen 3.0 compliant motherboards. Performance may vary from motherboard to motherboard. Refer to system or motherboard provider for individual product performance and features.
7. Testing conducted by AMD Performance Labs as of October 30, 2020, on three platforms and software versions typical for the launch dates of the Radeon Instinct MI25 (2018), MI50 (2019) and AMD Instinct MI100 GPU (2020) running the benchmark application Quicksilver. MI100 platform (2020): Gigabyte G482-Z51-00 system comprised of Dual Socket AMD EPYC™ 7702 64-Core Processor, AMD Instinct™ MI100 GPU, ROCm™ 3.10 driver, 512GB DDR4, RHEL 8.2. MI50 platform (2019): Supermicro® SYS-4029GP-TRT2 system comprised of Dual Socket Intel Xeon® Gold® 6132, Radeon Instinct™ MI50 GPU, ROCm 2.10 driver, 256 GB DDR4, SLES15SP1. MI25 platform (2018): Supermicro SYS-4028GR-TR2 system comprised of Dual Socket Intel Xeon CPU E5-2690, Radeon Instinct™ MI25 GPU, ROCm 2.0.89 driver, 246GB DDR4 system memory, Ubuntu 16.04.5 LTS. MI100-14
8. Testing conducted by AMD Performance Labs as of October 30, 2020, on three platforms and software versions typical for the launch dates of the Radeon Instinct MI25 (2018), MI50 (2019) and AMD Instinct MI100 GPU (2020) running the benchmark application TensorFlow ResNet 50 FP 16 batch size 128. MI100 platform (2020): Gigabyte G482-Z51-00 system comprised of Dual Socket AMD EPYC™ 7702 64-Core Processor, AMD Instinct™ MI100 GPU, ROCm™ 3.10 driver, 512GB DDR4, RHEL 8.2. MI50 platform (2019): Supermicro® SYS-4029GP-TRT2 system comprised of Dual Socket Intel Xeon® Gold® 6254, Radeon Instinct™ MI50 GPU, ROCm 3.0.6 driver, 338 GB DDR4, Ubuntu® 16.04.6 LTS. MI25 platform (2018): a Supermicro SYS-4028GR-TR2 system comprised of Dual Socket Intel Xeon CPU E5-2690, Radeon Instinct™ MI25 GPU, ROCm 2.0.89 driver, 246GB DDR4 system memory, Ubuntu 16.04.5 LTS. MI100-15