[OpenCV in Practice] 27: Parallel pixel access in OpenCV using forEach

Date: 2023-07-08

1 Mat pixel access

1.1 Direct pixel access with the at method

1.2 Pixel access with pointers

1.3 Pixel access with the forEach method

1.4 Using forEach with a C++11 lambda

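C++11 extended the for statement with the range-based for loop, which can iterate over C-style arrays, initializer lists, and any type for which begin() and end() are available (as members or as non-member overloads). OpenCV's Mat class offers a forEach method in a similar spirit: it is an ordinary member function template rather than a language feature, and it applies a function to every element. In this tutorial we compare the performance of Mat::forEach with other ways of accessing and transforming pixel values in OpenCV, and show that forEach can be faster than the at method and even than efficient pointer arithmetic. This article covers only the C++ usage of forEach; the Python equivalent is much simpler and easy to find with a search.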
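To make the C++11 language feature concrete, here is a minimal sketch of the range-based for loop over the three kinds of ranges mentioned above. It uses only the standard library; PixelRow and the three sum functions are hypothetical names introduced for illustration and have nothing to do with OpenCV.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical container: range-based for works with it only because
// the non-member begin()/end() overloads below are provided.
struct PixelRow
{
    unsigned char values[4];
};

inline unsigned char *begin(PixelRow &row) { return row.values; }
inline unsigned char *end(PixelRow &row) { return row.values + 4; }

// Sum a C-style array.
inline int sumArray()
{
    int data[] = {1, 2, 3};
    int sum = 0;
    for (int v : data)
        sum += v;
    return sum;
}

// Sum a braced initializer list.
inline int sumInitializerList()
{
    int sum = 0;
    for (int v : {10, 20, 30})
        sum += v;
    return sum;
}

// Sum a user-defined type through its non-member begin()/end().
inline int sumPixelRow()
{
    PixelRow row = {{1, 2, 3, 4}};
    int sum = 0;
    for (unsigned char v : row)
        sum += v;
    return sum;
}
```

Mat::forEach, by contrast, is called as a member function and receives the per-element operation as an argument.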

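OpenCV has some hidden features that are not widely known. One of them is the forEach method of the Mat class, which uses all the cores of the machine to apply an arbitrary function to every pixel.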

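We first define a function, complicatedThreshold. It takes a three-channel pixel and applies a complicated threshold to it (imread loads images in BGR order, so pixel.x is the blue channel). The code is as follows: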

// Define a pixel
typedef Point3_<uint8_t> Pixel;

// A complicated threshold is defined so
// a non-trivial amount of computation
// is done at each pixel.
void complicatedThreshold(Pixel &pixel)
{
  if (pow(double(pixel.x)/10,2.5) > 100)
  {
    pixel.x = 255;
    pixel.y = 255;
    pixel.z = 255;
  }
  else
  {
    pixel.x = 0;
    pixel.y = 0;
    pixel.z = 0;
  }
}

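Compared with a simple threshold, this function does far more computation per pixel. This lets us measure not only pixel access time but also how forEach uses all CPU cores when the per-pixel operation is expensive. Next, we apply four different methods to every pixel of an image and compare their relative performance.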
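It can help to see what the threshold actually does. The condition (x/10)^2.5 > 100 is equivalent to x > 10 * 100^0.4, roughly 63.1, so first-channel values of 64 and above become white. The sketch below checks this by brute force; the Pixel struct is a stand-in for Point3_<uint8_t> so that the block builds without OpenCV, and firstWhiteValue is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Stand-in for cv::Point3_<uint8_t>, so the sketch builds without OpenCV.
struct Pixel
{
    std::uint8_t x, y, z;
};

// Same logic as complicatedThreshold in the article.
inline void complicatedThreshold(Pixel &pixel)
{
    const std::uint8_t value = (std::pow(double(pixel.x) / 10, 2.5) > 100) ? 255 : 0;
    pixel.x = value;
    pixel.y = value;
    pixel.z = value;
}

// Smallest first-channel value that the threshold maps to white.
inline int firstWhiteValue()
{
    for (int v = 0; v <= 255; v++)
    {
        Pixel p = {static_cast<std::uint8_t>(v), 0, 0};
        complicatedThreshold(p);
        if (p.x == 255)
            return v;
    }
    return -1; // no value passes the threshold
}
```

The result is the same as a plain comparison against 63, which is the point: the pow call exists only to make each pixel expensive to process.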

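1 Mat pixel access

1.1 Direct pixel access with the at method

The Mat class has a convenient at method for accessing the pixel at a given (row, column) position. The following code uses at to visit every pixel and apply complicatedThreshold to it: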

    // Repeat the test numTrials times
    for (int n = 0; n < numTrials; n++)
    {
        // Naive pixel access: read the data directly with at
        // Loop over all rows
        for (int r = 0; r < image.rows; r++)
        {
            // Loop over all columns
            for (int c = 0; c < image.cols; c++)
            {
                // Obtain pixel at (r, c)
                Pixel pixel = image.at<Pixel>(r, c);
                // Apply complicatedThreshold
                complicatedThreshold(pixel);
                // Put result back
                image.at<Pixel>(r, c) = pixel;
            }
        }
    }

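This approach is considered inefficient because every call to at recomputes the pixel's location in memory. That computation involves multiplications and does not exploit the fact that the pixels lie in a contiguous block of memory.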
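The address arithmetic behind an at-style access can be sketched as follows. This is an illustration of the idea under simplifying assumptions (a tightly packed buffer of 3-byte pixels), not OpenCV's actual implementation; byteOffset and offsetsMatchPointerWalk are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a 3-channel, 8-bit pixel.
struct Pixel { std::uint8_t x, y, z; };

// Byte offset of pixel (r, c): one multiplication for the row index and
// one for the column index, repeated on every call.
inline std::size_t byteOffset(int r, int c, std::size_t rowStepBytes)
{
    return static_cast<std::size_t>(r) * rowStepBytes +
           static_cast<std::size_t>(c) * sizeof(Pixel);
}

// Check that recomputing the offset for each (r, c) reaches exactly the
// addresses that a single pointer, incremented once per pixel, visits.
inline bool offsetsMatchPointerWalk(int rows, int cols)
{
    std::vector<Pixel> buffer(static_cast<std::size_t>(rows) * cols);
    const std::size_t rowStepBytes = static_cast<std::size_t>(cols) * sizeof(Pixel);
    const std::uint8_t *base = reinterpret_cast<const std::uint8_t *>(buffer.data());
    const Pixel *walker = buffer.data();
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++, walker++)
            if (base + byteOffset(r, c, rowStepBytes) !=
                reinterpret_cast<const std::uint8_t *>(walker))
                return false;
    return true;
}
```

Both routes reach the same addresses; the pointer walk simply replaces two multiplications per pixel with one increment.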

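1.2 Pixel access with pointers

In OpenCV, all pixels in a row are stored in one contiguous block of memory. If the Mat object was created with create, all of its pixels are stored in a single contiguous block. Because imread, which we use to read the image from disk, creates the Mat object with create, we can avoid the multiplications and simply walk over all pixels with pointer arithmetic. The code is as follows: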

    // Access pixels through a pointer (similar to YUV image processing);
    // this assumes the image is stored contiguously
    for (int n = 0; n < numTrials; n++)
    {
        // Get pointer to first pixel
        Pixel *pixel = image1.ptr<Pixel>(0, 0);

        // Mat objects created using the create method are stored
        // in one continuous memory block.
        // Pointer one past the last pixel
        const Pixel *endPixel = pixel + image1.cols * image1.rows;

        // Loop over all pixels
        for (; pixel != endPixel; pixel++)
        {
            complicatedThreshold(*pixel);
        }
    }

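This method is efficient and widely used in practice, but it is still not optimal: in the measurements below it is only slightly faster than at, and direct pointer manipulation is error-prone.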
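One common source of pointer errors is assuming contiguity when rows are padded. The sketch below shows a defensive pattern: use one flat loop only when the data is contiguous, otherwise fetch a fresh pointer for each row. StridedImage is a hypothetical stand-in for a cv::Mat so the block builds without OpenCV; in OpenCV the corresponding calls would be Mat::isContinuous() and Mat::ptr<Pixel>(row).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pixel { std::uint8_t x, y, z; };

// Hypothetical stand-in for an image whose rows may be padded:
// `stride` is the number of Pixel slots per row in memory, `cols` the
// number of visible pixels per row (stride == cols means contiguous).
struct StridedImage
{
    int rows, cols, stride;
    std::vector<Pixel> data;

    StridedImage(int r, int c, int s)
        : rows(r), cols(c), stride(s),
          data(static_cast<std::size_t>(r) * s, Pixel{0, 0, 0}) {}

    bool isContinuous() const { return stride == cols; }
    Pixel *rowPtr(int r) { return data.data() + static_cast<std::size_t>(r) * stride; }
};

// Set every visible pixel to white. One flat loop when memory is
// contiguous, otherwise one pointer loop per row so padding is skipped.
inline void whitenAll(StridedImage &img)
{
    int rows = img.rows, cols = img.cols;
    if (img.isContinuous())
    {
        cols *= rows; // treat the whole image as a single long row
        rows = 1;
    }
    for (int r = 0; r < rows; r++)
    {
        Pixel *p = img.rowPtr(r);
        const Pixel *end = p + cols;
        for (; p != end; p++)
            *p = Pixel{255, 255, 255};
    }
}

// Count padding slots that were modified (0 if padding was skipped).
inline int touchedPaddingSlots(const StridedImage &img)
{
    int touched = 0;
    for (int r = 0; r < img.rows; r++)
        for (int c = img.cols; c < img.stride; c++)
            if (img.data[static_cast<std::size_t>(r) * img.stride + c].x != 0)
                touched++;
    return touched;
}
```

The image read by imread in this article is contiguous, so the flat loop is safe there; the per-row fallback matters for cases such as a sub-region of a larger image.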

1.3 Pixel access with the forEach method

The forEach method of the Mat class takes a function object (functor). Its signature is:

template<typename _Tp, typename Functor> void cv::Mat::forEach(const Functor &operation)

The easiest way to understand this is through the example below. We define a function object (Operator) to use with forEach. The code is as follows:

// Parallel execution with function object.
struct Operator
{
  void operator ()(Pixel &pixel, const int * position) const
  {
    // Perform a simple threshold operation
    complicatedThreshold(pixel);
  }
};

Calling forEach is simple and takes a single line of code:

// Call forEach
image2.forEach<Pixel>(Operator());

This method is fast and simple to use.
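The Operator above ignores its second argument. For a 2-D image, position points to the pixel's {row, column} indices, which allows operations that depend on location. The sketch below illustrates the calling convention with a serial stand-in, forEachPixel, which is a hypothetical helper and not OpenCV's implementation (the real forEach runs in parallel); LeftHalfWhite is likewise an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pixel { std::uint8_t x, y, z; };

// Serial stand-in (NOT OpenCV's implementation) that calls `operation`
// with the same two arguments Mat::forEach passes for a 2-D image:
// a reference to the pixel and a pointer to its {row, column} indices.
template <typename Functor>
void forEachPixel(std::vector<Pixel> &pixels, int rows, int cols, const Functor &operation)
{
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
        {
            const int position[2] = {r, c};
            operation(pixels[static_cast<std::size_t>(r) * cols + c], position);
        }
}

// Function object that uses the position argument: pixels in the
// left half of the image become white, the rest black.
struct LeftHalfWhite
{
    int cols;
    void operator()(Pixel &pixel, const int *position) const
    {
        const std::uint8_t v = (position[1] < cols / 2) ? 255 : 0;
        pixel.x = v;
        pixel.y = v;
        pixel.z = v;
    }
};
```

A functor written this way should also be usable with the real method, for example image2.forEach<Pixel>(LeftHalfWhite{image2.cols}), provided it does not write to shared state.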

1.4 Using forEach with a C++11 lambda

Lambdas are a new feature in C++11. For details on their use, see:

https://blog.csdn.net/lixiaogang_theanswer/article/details/80905445

The code is as follows:

    for (int n = 0; n < numTrials; n++)
    {
        // Parallel execution using C++11 lambda.
        image3.forEach<Pixel>([](Pixel &pixel, const int *position) -> void {
            complicatedThreshold(pixel);
        });
    }

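With this approach there is no need to define a function object, and the speed is on par with the function-object version of forEach.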
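The practical convenience of a lambda shows when the operation needs a parameter. A function object has to store it in a member, while a lambda can capture it. The sketch below compares the two using std::for_each over a plain vector as a stand-in for an image; BinarizeOp, binarizeWithFunctor and binarizeWithLambda are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Pixel { std::uint8_t x, y, z; };

// Function-object version: the cutoff is stored in a member.
struct BinarizeOp
{
    int cutoff;
    void operator()(Pixel &p) const
    {
        const std::uint8_t v = (p.x > cutoff) ? 255 : 0;
        p.x = v;
        p.y = v;
        p.z = v;
    }
};

inline void binarizeWithFunctor(std::vector<Pixel> &pixels, int cutoff)
{
    std::for_each(pixels.begin(), pixels.end(), BinarizeOp{cutoff});
}

// Lambda version: the cutoff is captured by value, no struct needed.
inline void binarizeWithLambda(std::vector<Pixel> &pixels, int cutoff)
{
    std::for_each(pixels.begin(), pixels.end(), [cutoff](Pixel &p) {
        const std::uint8_t v = (p.x > cutoff) ? 255 : 0;
        p.x = v;
        p.y = v;
        p.z = v;
    });
}
```

Both versions produce identical results. Since forEach runs the operation on several cores, a lambda that writes to captured shared state (a counter, for instance) would need synchronization such as std::atomic; capturing read-only values by value, as here, avoids the issue.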

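2 Performance comparison and code

We processed a large 9000 x 6750 image with complicatedThreshold. The 2.3 GHz Intel Core i5 processor used in the experiment has four cores. The timings obtained are listed below. Note that forEach makes the code roughly four times faster than naive pixel access (at) or pointer arithmetic.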

Method                    Time (ms)
at method                 10960.8
Pointer arithmetic        10171.9
forEach                   2686.1
forEach (C++11 lambda)    2747.2

For a small 300 x 225 image, the results are as follows:

Method                    Time (ms)
at method                 13.2
Pointer arithmetic        11.3
forEach                   4.6
forEach (C++11 lambda)    2.9

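For both small and large images, pointer arithmetic and direct access with at perform about the same. In these measurements forEach with a function object did best on the large image, while forEach with a lambda did best on the small one. One proposed explanation is that lambdas suit short, purely CPU-bound operations, such as ordinary loop bodies with little work per call, and are less suited to cases such as database I/O or heavy multithreading, where they may add overhead. Note, however, that a C++ lambda is compiled into an unnamed function object, so a gap of under 2 ms on the small image may equally reflect measurement noise, and this result should be read with caution.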
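The relative speedups can be worked out directly from the two tables. The helper below is a trivial sketch (speedup is a hypothetical name); the values in the comments are computed from the timings reported above and rounded to one decimal place.

```cpp
#include <cassert>

// Speedup of a method relative to a baseline, from two timings in ms.
inline double speedup(double baselineMs, double methodMs)
{
    return baselineMs / methodMs;
}

// Large image (9000 x 6750):
//   at method / forEach            = 10960.8 / 2686.1, about 4.1
//   pointer arithmetic / forEach   = 10171.9 / 2686.1, about 3.8
// Small image (300 x 225):
//   at method / forEach            = 13.2 / 4.6, about 2.9
//   at method / forEach (lambda)   = 13.2 / 2.9, about 4.6
```

On a four-core machine, a speedup close to 4 on the large image is consistent with the work being spread across all cores.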

All the code is available at:

https://github.com/luohenyueji/OpenCV-Practical-Exercise

C++:

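#include "pch.h" // Visual Studio precompiled header; remove if your project has none
#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>

// Use cv and std namespaces
using namespace cv;
using namespace std;

// Define a pixel: three 8-bit channels
typedef Point3_<uint8_t> Pixel;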

/**
 * @brief tic is called to start the timer
 *
 * @param t receives the current tick count
 */
void tic(double &t)
{
    t = (double)getTickCount();
}

/**
 * @brief toc is called to stop the timer
 *
 * @param t tick count stored by tic
 * @return double elapsed time in ms
 */
double toc(double &t)
{
    return ((double)getTickCount() - t) / getTickFrequency() * 1000;
}

/**
 * @brief Threshold a pixel
 *
 * @param pixel pixel to modify in place
 */
void complicatedThreshold(Pixel &pixel)
{
    // x, y and z hold the values of the three channels
    if (pow(double(pixel.x) / 10, 2.5) > 100)
    {
        pixel.x = 255;
        pixel.y = 255;
        pixel.z = 255;
    }
    else
    {
        pixel.x = 0;
        pixel.y = 0;
        pixel.z = 0;
    }
}

/**
 * @brief Function object for parallel execution with forEach.
 *
 */
struct Operator
{
    // Called once per pixel
    void operator()(Pixel &pixel, const int *position) const
    {
        // Apply the threshold to this pixel
        complicatedThreshold(pixel);
    }
};

int main()
{
    // Read image
    Mat image = imread("./image/butterfly.jpg");

    // Stop early if the image could not be loaded
    if (image.empty())
    {
        cerr << "Could not read ./image/butterfly.jpg" << endl;
        return 1;
    }

    // Scale image 30x in both width and height
    resize(image, image, Size(), 30, 30);

    // Print image size
    cout << "Image size " << image.size() << endl;

    // Number of trials
    int numTrials = 5;

    // Print number of trials
    cout << "Number of trials : " << numTrials << endl;

    // Make three copies, one for each of the remaining methods
    Mat image1 = image.clone();
    Mat image2 = image.clone();
    Mat image3 = image.clone();

    // Timer state; elapsed times are reported in ms
    double t;
    // Start timer
    tic(t);

    // Repeat the test numTrials times
    for (int n = 0; n < numTrials; n++)
    {
        // Naive pixel access: read the data directly with at
        // Loop over all rows
        for (int r = 0; r < image.rows; r++)
        {
            // Loop over all columns
            for (int c = 0; c < image.cols; c++)
            {
                // Obtain pixel at (r, c)
                Pixel pixel = image.at<Pixel>(r, c);
                // Apply complicatedThreshold
                complicatedThreshold(pixel);
                // Put result back
                image.at<Pixel>(r, c) = pixel;
            }
        }
    }
    // Print elapsed time
    cout << "Naive way: " << toc(t) << endl;

    // Start timer
    tic(t);

    // image1 is guaranteed to be continuous, but
    // if you are curious uncomment the line below
    // (prints 1 if the image is stored contiguously, 0 otherwise)
    //cout << "Image 1 is continuous : " << image1.isContinuous() << endl;

    // Access pixels through a pointer (similar to YUV image processing);
    // this assumes the image is stored contiguously
    for (int n = 0; n < numTrials; n++)
    {
        // Get pointer to first pixel
        Pixel *pixel = image1.ptr<Pixel>(0, 0);

        // Mat objects created using the create method are stored
        // in one continuous memory block.
        // Pointer one past the last pixel
        const Pixel *endPixel = pixel + image1.cols * image1.rows;

        // Loop over all pixels
        for (; pixel != endPixel; pixel++)
        {
            complicatedThreshold(*pixel);
        }
    }
    cout << "Pointer Arithmetic " << toc(t) << endl;

    tic(t);
    // Visit every pixel with forEach and a function object
    for (int n = 0; n < numTrials; n++)
    {
        image2.forEach<Pixel>(Operator());
    }
    cout << "forEach : " << toc(t) << endl;

    // Print the C++ standard version in use
    cout << __cplusplus << endl;

    // Use a C++11 lambda
    tic(t);
    for (int n = 0; n < numTrials; n++)
    {
        // Parallel execution using C++11 lambda.
        image3.forEach<Pixel>([](Pixel &pixel, const int *position) -> void {
            complicatedThreshold(pixel);
        });
    }
    cout << "forEach C++11 : " << toc(t) << endl;

    return 0;
}
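In the listing above, each toc(t) call reports the total time for all numTrials runs of a method. If a per-trial figure is preferred, a small timing helper based on std::chrono can be used instead. This is a sketch of an alternative, not part of the original code; averageMs is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>

// Run `work` numTrials times and return the average time per trial in ms.
template <typename Work>
double averageMs(int numTrials, Work work)
{
    const auto start = std::chrono::steady_clock::now();
    for (int n = 0; n < numTrials; n++)
        work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / numTrials;
}
```

With OpenCV available, one of the loops above could then be timed as averageMs(numTrials, [&image2]() { image2.forEach<Pixel>(Operator()); }). steady_clock is chosen because it is monotonic and therefore not affected by system clock adjustments during the run.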

3 References

https://www.learnopencv.com/parallel-pixel-access-in-opencv-using-foreach/
