According to IDC, IoT-generated data will reach nearly 80 zettabytes by 2025. How can enterprises make all this data actionable? For many applications, the answer lies in “edge computing,” which puts IT service environments and cloud-computing capabilities at the edge of the network, reducing the need to send data to distant servers in data centers and the cloud.

With more data generated at the edge, AI-based workloads are rapidly moving there as well, changing how compute is managed at the edge.

What is MEC?
Multi-access Edge Computing (MEC) is a type of edge computing that uses cellular networks for primary connectivity. Together, 5G and MEC appeal to applications that demand low latency, faster processing, and a better user experience, such as:

  • Immersive Experience (AR/VR and Mixed Reality)

  • Automotive and Transportation (connected vehicles)

  • Medical and Hospitals (intelligent medical devices)

  • Remote Operation (factory and hospitals)

  • Intelligent Automation (industries and smart cities)

AI at Edge Compute (GPU) 

The ability to provide AI-on-5G depends on optimized accelerated computing (such as GPUs) in data centers at the edge, where it is needed. This means GPUs are now deployed in the MEC at the 5G RAN; NVIDIA EGX servers are one example of GPU-equipped servers deployed there. GPUs provide the acceleration to serve AI applications as fast as possible, but, as noted above, they bring certain challenges: limited GPU compute availability, high total cost of ownership (TCO), and the difficulty of dynamically scaling GPUs with workload demand.

Compute (GPU) Challenges at MEC 

Unlike clouds and data centers, the MEC has far less compute power and must serve applications under tight latency, performance, and power constraints, which severely restricts application and data processing. The main challenges:

  • Compute scarcity: The MEC cannot scale out and add compute the way clouds can.

  • High TCO: GPU compute is expensive and often underutilized at the edge.

  • Serialized application servicing: Multiple applications cannot effectively run in parallel.

  • Varied workloads: Applications such as AR/VR, medical, and connected vehicles each have a different workload profile, with unpredictable demand spikes.

  • Bottlenecks and disruptions: Compute allocations are static, with preset shares (for example, AR/VR gets 40% of compute and connected vehicles get 30%). Any change in application demand can lead to delays and disruptions, as the sketch below illustrates.
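
To make the bottleneck concrete, here is a minimal sketch with hypothetical numbers: each application holds a fixed preset share of GPU capacity, so a demand spike in one application causes queuing even while other applications leave their shares idle.

```python
# Minimal sketch (hypothetical numbers): why static preset shares break down.
# Each app holds a fixed share of 100 GPU units, regardless of actual demand.

STATIC_SHARES = {"ar_vr": 40, "connected_vehicles": 30, "medical": 30}

def serve(demand):
    """Report, per app, how current demand fits its preset share."""
    for app, share in STATIC_SHARES.items():
        need = demand.get(app, 0)
        if need > share:
            # Excess work queues up or is dropped -> latency and disruption,
            # even while other apps leave their shares idle.
            print(f"{app}: demand {need} > share {share} (bottleneck)")
        else:
            print(f"{app}: demand {need} <= share {share} "
                  f"({share - need} units sit idle)")

# An AR/VR demand spike: total demand (100) equals total capacity,
# yet the static split still causes a bottleneck plus idle capacity.
serve({"ar_vr": 60, "connected_vehicles": 20, "medical": 20})
```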

raptlQ™ + 5G + AI + GPU
raptlQ™ is an end-to-end adaptive compute workflow automation platform that powers ML apps with “Shareable, Elastic, Optimized compute for ANY AI App/model, ANY chip, ANYwhere.” It is Kubernetes-integrated and works seamlessly with any management framework and server.
raptlQ™ provides dynamic, multi-tenant compute workflows for applications at the edge, delivering superior performance for your applications at the MEC.

raptlQ™ automatically manages resources at both the edge and the core (clouds and data centers), enabling smart placement of application workloads within the limited compute at the MEC and extending to the core over the backhaul automatically and transparently. This delivers the best TCO and guaranteed performance for apps running at the edge, maximizes resource utilization for AI workloads at both edge and core, and optimizes capital expenditure (CAPEX).
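
Because raptlQ™ is Kubernetes-integrated, the workloads it places are, at bottom, GPU-scheduled pods. The sketch below uses the standard Kubernetes Python client to show the kind of pod such an orchestrator might place at an edge site; the image, labels, and site names are hypothetical, and raptlQ™’s actual integration points are not shown here.

```python
# Minimal sketch, assuming the standard Kubernetes Python client
# (pip install kubernetes). Shows the kind of GPU-scheduled pod a
# Kubernetes-integrated orchestrator could place at an edge site.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="ar-vr-inference",
        labels={"tier": "edge"},  # hypothetical label a placement engine might use
    ),
    spec=client.V1PodSpec(
        # Hypothetical node label pinning the pod to a MEC site at the 5G RAN.
        node_selector={"topology.example.com/site": "mec-ran-1"},
        containers=[
            client.V1Container(
                name="inference",
                image="example.com/ar-vr-model:latest",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    # Standard NVIDIA device-plugin resource name.
                    limits={"nvidia.com/gpu": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="edge-apps", body=pod)
```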

End-to-end compute workflow orchestration –
raptlQ™ transparently manages resources at both edge and cloud as one entity.

More work with less compute –
raptlQ™ makes it possible to run many more applications on the available compute.

Reduce backhaul to the cloud –
Any application that cannot get enough compute at the edge must be backhauled to the cloud. raptlQ™’s “guaranteed quotas” let apps run at the MEC itself, automatically reducing backhaul to the cloud, as sketched below.
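
A minimal sketch of the placement idea behind guaranteed quotas: run an app at the MEC when its quota fits the remaining edge capacity, and backhaul it to the core only when it does not. Names and numbers are hypothetical; raptlQ™’s actual engine is not shown here.

```python
# Place each (app, guaranteed_quota) at the edge if its quota fits the
# remaining MEC capacity; otherwise backhaul it to the core.

EDGE_CAPACITY = 100  # hypothetical GPU units available at the MEC site

def place(apps):
    """Assign each (app, guaranteed_quota) pair to 'edge' or 'core'."""
    placement, used = {}, 0
    for app, quota in apps:
        if used + quota <= EDGE_CAPACITY:
            used += quota          # quota is reserved: the app cannot be starved
            placement[app] = "edge"
        else:
            placement[app] = "core"  # backhaul only when the edge cannot fit it
    return placement

print(place([("ar_vr", 50), ("connected_vehicles", 30), ("medical", 40)]))
# {'ar_vr': 'edge', 'connected_vehicles': 'edge', 'medical': 'core'}
```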

Reduced latency –
raptlQ™ runs applications in parallel, eliminating the serialized application servicing at the MEC that causes high latency and application delays.

Compute availability –
raptlQ™’s guaranteed quotas ensure that compute is always available to apps at the MEC.

Auto GPU sharing –
Unlike the cloud, the MEC has limited compute; raptlQ™ automatically shares GPU compute among all running applications.

Workload-based compute allocations –
raptlQ™ learns each workload’s pattern and assigns exactly the share it requires, as sketched below.
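
A minimal sketch of workload-driven allocation, assuming a simple exponential moving average stands in for the “learning”; raptlQ™’s actual model is not public here. The point is that shares track observed demand instead of fixed presets.

```python
# Shares follow smoothed observed demand rather than static presets.

class DemandTracker:
    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.ema = {}  # smoothed demand per app

    def observe(self, demand):
        """Update the exponential moving average with the latest demand."""
        for app, d in demand.items():
            prev = self.ema.get(app, d)
            self.ema[app] = self.alpha * d + (1 - self.alpha) * prev

    def shares(self, capacity):
        """Split capacity in proportion to each app's smoothed demand."""
        total = sum(self.ema.values()) or 1.0
        return {app: capacity * d / total for app, d in self.ema.items()}

tracker = DemandTracker()
tracker.observe({"ar_vr": 60, "connected_vehicles": 20, "medical": 20})
tracker.observe({"ar_vr": 30, "connected_vehicles": 50, "medical": 20})
print(tracker.shares(capacity=100))  # shares shift toward the current spike
```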

Zero user intervention –
No preset compute shares are required, since workloads are dynamic and unpredictable.