复杂Excel转换与导入
阅读原文时间:2023年07月11日阅读:1

  把不同客户提供Excel 直接导入到系统中生成对应的收货单或是出货单。后端创建收货端和出货单的接口已经有现成的webservice或是标准的xml;这类需要做的就是把客户提供不同种类的Excel mapping成标准格式。

  不同格式的excel如何找到对应的数据项,比如一个Excel中需要字段分别在不同的sheet或是不同的位置上。

  第一个定位配置信息,sheet-name:数据从哪个sheet中读取,默认sheet1,start-tag:固定标识,查找Excel中一些特殊文本信息来定位具体的celladdress(行,列),data-offset:设置一个偏移量,在找到具体内容的地址后可能真正需要数据在后面,那就需要设置一个偏移量待读取信息, end-tag:结束位置,用于循环读取的范围。

第二个映射字段明,XmlNode Name:就是目标字段明,data-field:Excel中对应的字段名称(含有表头的行),data-type:目标字段的类型,data-formatter:格式化截取excel Cell中的内容 比如:需要通过substring,或splitl来取其中的内容。

  第三个表示单个表头,还是循环的表体 replicate="true" 表示需要循环读取。

先根据配置规则把Excel中需要的信息提取出来并生成一个XML文档,如果标准的接口XML结构和数据都比较复杂,那么还需要使用XSLT语言来做更复杂的mapping,满足后端服务的要求。

原始Excel

配置规则XML

转换成初步XML

代码非常的简单,还是第一个版本,以后再慢慢优化和重构

class Program
{
static async Task Main(string[] args)
{
var path = @"d:\9C箱单0000880680.xlsx";
var configpath = @"d:\XslImportRule1.xml";
var xdoc = XDocument.Load(configpath);
var root = xdoc.Root.Name;
var descxml = new XDocument();
descxml.Add(new XElement(xdoc.Root.Name));
var workbook = new XSSFWorkbook(path);
Process(workbook,null, xdoc.Root, , descxml.Root,null);
descxml.Save("d:\\output.xml");
return;
}
static void Process(IWorkbook book, ISheet sheet, XElement element, int depth, XElement root,DataRow dr)
{
var pelment = element.Parent;
var name = element.Name;
var atts = element.Attributes();
var replicate = atts.Where(x => x.Name == "replicate").FirstOrDefault()?.Value;
var sheetname = atts.Where(x => x.Name == "sheet-name").FirstOrDefault()?.Value;
var starttag = atts.Where(x => x.Name == "start-tag").FirstOrDefault()?.Value;
var start = atts.Where(x => x.Name == "start").FirstOrDefault()?.Value;
var endtag = atts.Where(x => x.Name == "end-tag").FirstOrDefault()?.Value;
var end = atts.Where(x => x.Name == "end").FirstOrDefault()?.Value;
var fieldname = atts.Where(x => x.Name == "data-field").Select(x => x.Value).FirstOrDefault();
var datatype = atts.Where(x => x.Name == "data-type").Select(x => x.Value);
var defaultvalue = atts.Where(x => x.Name == "data-default").FirstOrDefault()?.Value;
var formatter = atts.Where(x => x.Name == "data-formatter").FirstOrDefault()?.Value;
var offset = atts.Where(x => x.Name == "data-offset").FirstOrDefault()?.Value;
XElement copyelement = null;

         //if (element.Parent != null )  
         //{  
         //    copyelement = new XElement(name);  
         //    root.Add(copyelement);  
         //}  
         if (!string.IsNullOrEmpty(replicate) && !string.IsNullOrEmpty(sheetname)) {  
             sheet = book.GetSheet(sheetname);  
         }

         if (!element.HasElements)  
         {  
             copyelement = new XElement(name);  
             root.Add(copyelement);  
             // element is child with no descendants  
             if (dr == null)  
             {  
                 CellAddress celladdress = null;  
                 if (!string.IsNullOrEmpty(starttag))  
                 {  
                     celladdress = findXslx(sheet, starttag);  
                 }  
                 else if (!string.IsNullOrEmpty(start))  
                 {  
                     celladdress = new CellAddress(new CellReference(start));  
                 }  
                 if (celladdress != null)  
                 {  
                     var r = ;  
                     var c = ;  
                     if (!string.IsNullOrEmpty(offset))  
                     {  
                         var sp = offset.Split(';');  
                         foreach (var ts in sp)  
                         {  
                             var sparray = ts.Split(':');  
                             if (sparray\[\].Equals("c", StringComparison.OrdinalIgnoreCase))  
                             {  
                                 c = Convert.ToInt32(sparray\[\]);  
                             }  
                             else  
                             {  
                                 r = Convert.ToInt32(sparray\[\]);  
                             }  
                         }  
                     }  
                     var cell = sheet.GetRow(celladdress.Row + r).GetCell(celladdress.Column + c);  
                     var val = getCellValue(cell);  
                     if (string.IsNullOrEmpty(val) && !string.IsNullOrEmpty(defaultvalue))  
                     {  
                         val = defaultvalue;  
                     }  
                     if (!string.IsNullOrEmpty(val) && !string.IsNullOrEmpty(formatter))  
                     {  
                         var codescript = formatter.Replace("$", "\\"" + val + "\\"");  
                         var fval = CSharpScript.EvaluateAsync<string>(codescript).Result;  
                         val = fval;  
                     }  
                     copyelement.SetValue(val);  
                 }  
                 else if (!string.IsNullOrEmpty(defaultvalue))  
                 {  
                     copyelement.SetValue(defaultvalue);  
                 }  
             }  
             else  
             {  
                if(dr.Table.Columns.Contains(fieldname))  
                 {  
                     var val =  dr\[fieldname\].ToString();  
                     if (string.IsNullOrEmpty(val) && !string.IsNullOrEmpty(defaultvalue))  
                     {  
                         val = defaultvalue;

                     }  
                     copyelement.SetValue(val);  
                 }  
                 else if(!string.IsNullOrEmpty(defaultvalue))  
                 {  
                    copyelement.SetValue(defaultvalue);  
                 }  
             }

         }  
         else  
         {  
             depth++;  
             if (replicate == "true")  
             {  
                 var datatable= filldatatable(sheet, starttag, start, endtag, end, offset);  
                 if (datatable.Rows.Count > )  
                 {  
                     foreach (DataRow datarow in datatable.Rows)  
                     {  
                         copyelement = new XElement(name);  
                         foreach (var child in element.Elements())  
                         {  
                             if (copyelement != null)  
                             {  
                                 Process(book, sheet, child, depth, copyelement, datarow);  
                             }  
                             else  
                             {  
                                 Process(book, sheet, child, depth, root, datarow);  
                             }

                         }  
                         root.Add(copyelement);  
                     }  
                 }  
             }  
             else  
             {  
                 if (element.Parent != null)  
                 {  
                     copyelement = new XElement(name);  
                     root.Add(copyelement);  
                 }  
                 foreach (var child in element.Elements())  
                 {  
                     if (copyelement != null)  
                     {  
                         Process(book,sheet, child, depth, copyelement,null);  
                     }  
                     else  
                     {  
                         Process(book,sheet, child, depth, root,null);  
                     }

                 }  
             }

             depth--;  
         }  
     }

     private static DataTable filldatatable(ISheet sheet, string starttag, string start, string endtag, string end, string offset)  
     {  
         CellAddress startaddress = null;  
         CellAddress endaddress = null;  
         if (!string.IsNullOrEmpty(starttag))  
         {  
             startaddress = findXslx(sheet, starttag);  
         }  
         else if (!string.IsNullOrEmpty(start))  
         {  
             startaddress = new CellAddress(new CellReference(start));  
         }  
         else  
         {  
             startaddress = new CellAddress(new CellReference("A0"));  
         }  
         if (!string.IsNullOrEmpty(endtag))  
         {  
             endaddress = findXslx(sheet, endtag);  
         }  
         else if (!string.IsNullOrEmpty(end))  
         {  
             endaddress = new CellAddress(new CellReference(end));  
         }  
         else  
         {  
             endaddress = null;  
         }  
         var offsetr = ;  
         var offsetc = ;  
         if (!string.IsNullOrEmpty(offset))  
         {  
             var sp = offset.Split(';');  
             foreach (var ts in sp)  
             {  
                 var sparray = ts.Split(':');  
                 if (sparray\[\].Equals("c", StringComparison.OrdinalIgnoreCase))  
                 {  
                     offsetc = Convert.ToInt32(sparray\[\]);  
                 }  
                 else  
                 {  
                     offsetr = Convert.ToInt32(sparray\[\]);  
                 }  
             }  
         }  
         var firstrow = startaddress == null ? sheet.FirstRowNum : startaddress.Row + offsetr;  
         var lastrow = (endaddress == null) ? sheet.LastRowNum : endaddress.Row;  
         var table = new DataTable();  
         var lastcell = ; //row.LastCellNum;  
         var firstcell = ; //row.FirstCellNum + offsetc;  
         for (int r = firstrow; r < lastrow; r++)  
         {  
             var row = sheet.GetRow(r);  
             if (row == null) continue;

             if (r == firstrow)  
             {  
                  lastcell =  row.LastCellNum;  
                  firstcell = row.FirstCellNum + offsetc;  
                 for (int c = firstcell; c < lastcell; c++)  
                 {  
                     var cell = row.GetCell(c);  
                     if (cell == null) continue;  
                     var strval = getCellValue(cell).Trim();  
                     if (!string.IsNullOrEmpty(strval))  
                     {  
                         table.Columns.Add(new DataColumn(strval));  
                     }  
                 }  
             }  
             else  
             {  
                 var dataRow = table.NewRow();  
                 var array = new string\[table.Columns.Count\];  
                 //for (var c = 0; c < table.Columns.Count; c++)  
                 //{  
                 //    var cell = row.GetCell(firstcell+c);  
                 //    var val = getCellValue(cell).Trim();  
                 //    array\[c\] = val;  
                 //}  
                 for (int c = firstcell; c < lastcell; c++)  
                 {  
                     var cell = row.GetCell(c);  
                     var val = getCellValue(cell).Trim();  
                     array\[c- firstcell\] = val;  
                 }  
                 dataRow.ItemArray = array;  
                 table.Rows.Add(dataRow);  
             }  
         }  
         return table;  
     }

     private static CellAddress findXslx(ISheet sheet, string key)  
     {  
         var lastrow = sheet.LastRowNum;  
         var firstrow = sheet.FirstRowNum;  
         for (int r = firstrow; r < lastrow; r++)  
         {  
             var row = sheet.GetRow(r);  
             if (row == null) continue;  
             var lastcell = row.LastCellNum;  
             var firstcell = row.FirstCellNum;  
             for (int c = firstcell; c < lastcell; c++)  
             {  
                 var cell = row.GetCell(c);  
                 if (cell == null) continue;  
                 var strval = getCellValue(cell).Trim();  
                 //if (strval.Trim().Equals(key, StringComparison.OrdinalIgnoreCase))  
                 //{  
                 //    return cell.Address;  
                 //}  
                 if (match(key, strval))  
                 {  
                     return cell.Address;  
                 }  
             }  
         }  
         return null;  
     }  
     private static string getCellValue(ICell cell)  
     {  
         if (cell == null)  
         {  
             return string.Empty;  
         }  
         var dataFormatter = new DataFormatter(CultureInfo.CurrentCulture);

         // If this is not part of a merge cell,  
         // just get this cell's value like normal.  
         if (!cell.IsMergedCell)  
         {  
             return dataFormatter.FormatCellValue(cell);  
         }

         // Otherwise, we need to find the value of this merged cell.  
         else  
         {  
             // Get current sheet.  
             var currentSheet = cell.Sheet;

             // Loop through all merge regions in this sheet.  
             for (int i = ; i < currentSheet.NumMergedRegions; i++)  
             {  
                 var mergeRegion = currentSheet.GetMergedRegion(i);

                 // If this merged region contains this cell.  
                 if (mergeRegion.FirstRow <= cell.RowIndex && cell.RowIndex <= mergeRegion.LastRow &&  
                     mergeRegion.FirstColumn <= cell.ColumnIndex && cell.ColumnIndex <= mergeRegion.LastColumn)  
                 {  
                     // Find the top-most and left-most cell in this region.  
                     var firstRegionCell = currentSheet.GetRow(mergeRegion.FirstRow)  
                                             .GetCell(mergeRegion.FirstColumn);

                     // And return its value.  
                     return dataFormatter.FormatCellValue(firstRegionCell);  
                 }  
             }  
             // This should never happen.  
             throw new Exception("Cannot find this cell in any merged region");  
         }  
     }

     static bool match(string pattern, string input)  
     {  
         if (String.Compare(pattern, input) == )  
         {  
             return true;  
         }  
         else if (String.IsNullOrEmpty(input))  
         {  
             if (String.IsNullOrEmpty(pattern.Trim(new Char\[\] { '\*' })))  
             {  
                 return true;  
             }  
             else  
             {  
                 return false;  
             }  
         }  
         else if (pattern.Length == )  
         {  
             return false;  
         }  
         else if (pattern\[\] == '?')  
         {  
             return match(pattern.Substring(), input.Substring());  
         }  
         else if (pattern\[pattern.Length - \] == '?')  
         {  
             return match(pattern.Substring(, pattern.Length - ),  
                                        input.Substring(, input.Length - ));  
         }  
         else if (pattern\[\] == '\*')  
         {  
             if (match(pattern.Substring(), input))  
             {  
                 return true;  
             }  
             else  
             {  
                 return match(pattern, input.Substring());  
             }  
         }  
         else if (pattern\[pattern.Length - \] == '\*')  
         {  
             if (match(pattern.Substring(, pattern.Length - ), input))  
             {  
                 return true;  
             }  
             else  
             {  
                 return match(pattern, input.Substring(, input.Length - ));  
             }  
         }  
         else if (pattern\[\] == input\[\])  
         {  
             return match(pattern.Substring(), input.Substring());  
         }  
         return false;  
     }  
 }  

}

代码库

https://github.com/neozhu/excelcompleximport

最近还会继续更新

手机扫一扫

移动阅读更方便

阿里云服务器
腾讯云服务器
七牛云服务器