<!--- SPDX-License-Identifier: Apache-2.0 -->
## Operator Schemas
*This file is automatically generated from the
[def files](/onnx/defs) via [this script](/onnx/defs/gen_doc.py).
Do not modify it directly; edit the operator definitions instead.*

An operator input or output can be differentiable, non-differentiable,
or of undefined differentiability. If a variable's differentiability
is not specified, it is undefined.

### ai.onnx (default)
|**Operator**|**Since version**||
|-|-|-|
|<a href="#Abs">Abs</a>|<a href="Changelog.md#Abs-13">13</a>, <a href="Changelog.md#Abs-6">6</a>, <a href="Changelog.md#Abs-1">1</a>|
|<a href="#Acos">Acos</a>|<a href="Changelog.md#Acos-7">7</a>|
|<a href="#Acosh">Acosh</a>|<a href="Changelog.md#Acosh-9">9</a>|
|<a href="#Add">Add</a>|<a href="Changelog.md#Add-14">14</a>, <a href="Changelog.md#Add-13">13</a>, <a href="Changelog.md#Add-7">7</a>, <a href="Changelog.md#Add-6">6</a>, <a href="Changelog.md#Add-1">1</a>|
|<a href="#And">And</a>|<a href="Changelog.md#And-7">7</a>, <a href="Changelog.md#And-1">1</a>|
|<a href="#ArgMax">ArgMax</a>|<a href="Changelog.md#ArgMax-13">13</a>, <a href="Changelog.md#ArgMax-12">12</a>, <a href="Changelog.md#ArgMax-11">11</a>, <a href="Changelog.md#ArgMax-1">1</a>|
|<a href="#ArgMin">ArgMin</a>|<a href="Changelog.md#ArgMin-13">13</a>, <a href="Changelog.md#ArgMin-12">12</a>, <a href="Changelog.md#ArgMin-11">11</a>, <a href="Changelog.md#ArgMin-1">1</a>|
|<a href="#Asin">Asin</a>|<a href="Changelog.md#Asin-7">7</a>|
|<a href="#Asinh">Asinh</a>|<a href="Changelog.md#Asinh-9">9</a>|
|<a href="#Atan">Atan</a>|<a href="Changelog.md#Atan-7">7</a>|
|<a href="#Atanh">Atanh</a>|<a href="Changelog.md#Atanh-9">9</a>|
|<a href="#AveragePool">AveragePool</a>|<a href="Changelog.md#AveragePool-19">19</a>, <a href="Changelog.md#AveragePool-11">11</a>, <a href="Changelog.md#AveragePool-10">10</a>, <a href="Changelog.md#AveragePool-7">7</a>, <a href="Changelog.md#AveragePool-1">1</a>|
|<a href="#BatchNormalization">BatchNormalization</a>|<a href="Changelog.md#BatchNormalization-15">15</a>, <a href="Changelog.md#BatchNormalization-14">14</a>, <a href="Changelog.md#BatchNormalization-9">9</a>, <a href="Changelog.md#BatchNormalization-7">7</a>, <a href="Changelog.md#BatchNormalization-6">6</a>, <a href="Changelog.md#BatchNormalization-1">1</a>|
|<a href="#BitShift">BitShift</a>|<a href="Changelog.md#BitShift-11">11</a>|
|<a href="#BitwiseAnd">BitwiseAnd</a>|<a href="Changelog.md#BitwiseAnd-18">18</a>|
|<a href="#BitwiseNot">BitwiseNot</a>|<a href="Changelog.md#BitwiseNot-18">18</a>|
|<a href="#BitwiseOr">BitwiseOr</a>|<a href="Changelog.md#BitwiseOr-18">18</a>|
|<a href="#BitwiseXor">BitwiseXor</a>|<a href="Changelog.md#BitwiseXor-18">18</a>|
|<a href="#Cast">Cast</a>|<a href="Changelog.md#Cast-21">21</a>, <a href="Changelog.md#Cast-19">19</a>, <a href="Changelog.md#Cast-13">13</a>, <a href="Changelog.md#Cast-9">9</a>, <a href="Changelog.md#Cast-6">6</a>, <a href="Changelog.md#Cast-1">1</a>|
|<a href="#Ceil">Ceil</a>|<a href="Changelog.md#Ceil-13">13</a>, <a href="Changelog.md#Ceil-6">6</a>, <a href="Changelog.md#Ceil-1">1</a>|
|<a href="#Col2Im">Col2Im</a>|<a href="Changelog.md#Col2Im-18">18</a>|
|<a href="#Compress">Compress</a>|<a href="Changelog.md#Compress-11">11</a>, <a href="Changelog.md#Compress-9">9</a>|
|<a href="#Concat">Concat</a>|<a href="Changelog.md#Concat-13">13</a>, <a href="Changelog.md#Concat-11">11</a>, <a href="Changelog.md#Concat-4">4</a>, <a href="Changelog.md#Concat-1">1</a>|
|<a href="#ConcatFromSequence">ConcatFromSequence</a>|<a href="Changelog.md#ConcatFromSequence-11">11</a>|
|<a href="#Constant">Constant</a>|<a href="Changelog.md#Constant-21">21</a>, <a href="Changelog.md#Constant-19">19</a>, <a href="Changelog.md#Constant-13">13</a>, <a href="Changelog.md#Constant-12">12</a>, <a href="Changelog.md#Constant-11">11</a>, <a href="Changelog.md#Constant-9">9</a>, <a href="Changelog.md#Constant-1">1</a>|
|<a href="#ConstantOfShape">ConstantOfShape</a>|<a href="Changelog.md#ConstantOfShape-21">21</a>, <a href="Changelog.md#ConstantOfShape-20">20</a>, <a href="Changelog.md#ConstantOfShape-9">9</a>|
|<a href="#Conv">Conv</a>|<a href="Changelog.md#Conv-11">11</a>, <a href="Changelog.md#Conv-1">1</a>|
|<a href="#ConvInteger">ConvInteger</a>|<a href="Changelog.md#ConvInteger-10">10</a>|
|<a href="#ConvTranspose">ConvTranspose</a>|<a href="Changelog.md#ConvTranspose-11">11</a>, <a href="Changelog.md#ConvTranspose-1">1</a>|
|<a href="#Cos">Cos</a>|<a href="Changelog.md#Cos-7">7</a>|
|<a href="#Cosh">Cosh</a>|<a href="Changelog.md#Cosh-9">9</a>|
|<a href="#CumSum">CumSum</a>|<a href="Changelog.md#CumSum-14">14</a>, <a href="Changelog.md#CumSum-11">11</a>|
|<a href="#DFT">DFT</a>|<a href="Changelog.md#DFT-20">20</a>, <a href="Changelog.md#DFT-17">17</a>|
|<a href="#DeformConv">DeformConv</a>|<a href="Changelog.md#DeformConv-19">19</a>|
|<a href="#DepthToSpace">DepthToSpace</a>|<a href="Changelog.md#DepthToSpace-13">13</a>, <a href="Changelog.md#DepthToSpace-11">11</a>, <a href="Changelog.md#DepthToSpace-1">1</a>|
|<a href="#DequantizeLinear">DequantizeLinear</a>|<a href="Changelog.md#DequantizeLinear-21">21</a>, <a href="Changelog.md#DequantizeLinear-19">19</a>, <a href="Changelog.md#DequantizeLinear-13">13</a>, <a href="Changelog.md#DequantizeLinear-10">10</a>|
|<a href="#Det">Det</a>|<a href="Changelog.md#Det-11">11</a>|
|<a href="#Div">Div</a>|<a href="Changelog.md#Div-14">14</a>, <a href="Changelog.md#Div-13">13</a>, <a href="Changelog.md#Div-7">7</a>, <a href="Changelog.md#Div-6">6</a>, <a href="Changelog.md#Div-1">1</a>|
|<a href="#Dropout">Dropout</a>|<a href="Changelog.md#Dropout-13">13</a>, <a href="Changelog.md#Dropout-12">12</a>, <a href="Changelog.md#Dropout-10">10</a>, <a href="Changelog.md#Dropout-7">7</a>, <a href="Changelog.md#Dropout-6">6</a>, <a href="Changelog.md#Dropout-1">1</a>|
|<a href="#Einsum">Einsum</a>|<a href="Changelog.md#Einsum-12">12</a>|
|<a href="#Equal">Equal</a>|<a href="Changelog.md#Equal-19">19</a>, <a href="Changelog.md#Equal-13">13</a>, <a href="Changelog.md#Equal-11">11</a>, <a href="Changelog.md#Equal-7">7</a>, <a href="Changelog.md#Equal-1">1</a>|
|<a href="#Erf">Erf</a>|<a href="Changelog.md#Erf-13">13</a>, <a href="Changelog.md#Erf-9">9</a>|
|<a href="#Exp">Exp</a>|<a href="Changelog.md#Exp-13">13</a>, <a href="Changelog.md#Exp-6">6</a>, <a href="Changelog.md#Exp-1">1</a>|
|<a href="#Expand">Expand</a>|<a href="Changelog.md#Expand-13">13</a>, <a href="Changelog.md#Expand-8">8</a>|
|<a href="#EyeLike">EyeLike</a>|<a href="Changelog.md#EyeLike-9">9</a>|
|<a href="#Flatten">Flatten</a>|<a href="Changelog.md#Flatten-21">21</a>, <a href="Changelog.md#Flatten-13">13</a>, <a href="Changelog.md#Flatten-11">11</a>, <a href="Changelog.md#Flatten-9">9</a>, <a href="Changelog.md#Flatten-1">1</a>|
|<a href="#Floor">Floor</a>|<a href="Changelog.md#Floor-13">13</a>, <a href="Changelog.md#Floor-6">6</a>, <a href="Changelog.md#Floor-1">1</a>|
|<a href="#GRU">GRU</a>|<a href="Changelog.md#GRU-14">14</a>, <a href="Changelog.md#GRU-7">7</a>, <a href="Changelog.md#GRU-3">3</a>, <a href="Changelog.md#GRU-1">1</a>|
|<a href="#Gather">Gather</a>|<a href="Changelog.md#Gather-13">13</a>, <a href="Changelog.md#Gather-11">11</a>, <a href="Changelog.md#Gather-1">1</a>|
|<a href="#GatherElements">GatherElements</a>|<a href="Changelog.md#GatherElements-13">13</a>, <a href="Changelog.md#GatherElements-11">11</a>|
|<a href="#GatherND">GatherND</a>|<a href="Changelog.md#GatherND-13">13</a>, <a href="Changelog.md#GatherND-12">12</a>, <a href="Changelog.md#GatherND-11">11</a>|
|<a href="#Gemm">Gemm</a>|<a href="Changelog.md#Gemm-13">13</a>, <a href="Changelog.md#Gemm-11">11</a>, <a href="Changelog.md#Gemm-9">9</a>, <a href="Changelog.md#Gemm-7">7</a>, <a href="Changelog.md#Gemm-6">6</a>, <a href="Changelog.md#Gemm-1">1</a>|
|<a href="#GlobalAveragePool">GlobalAveragePool</a>|<a href="Changelog.md#GlobalAveragePool-1">1</a>|
|<a href="#GlobalLpPool">GlobalLpPool</a>|<a href="Changelog.md#GlobalLpPool-2">2</a>, <a href="Changelog.md#GlobalLpPool-1">1</a>|
|<a href="#GlobalMaxPool">GlobalMaxPool</a>|<a href="Changelog.md#GlobalMaxPool-1">1</a>|
|<a href="#Greater">Greater</a>|<a href="Changelog.md#Greater-13">13</a>, <a href="Changelog.md#Greater-9">9</a>, <a href="Changelog.md#Greater-7">7</a>, <a href="Changelog.md#Greater-1">1</a>|
|<a href="#GridSample">GridSample</a>|<a href="Changelog.md#GridSample-20">20</a>, <a href="Changelog.md#GridSample-16">16</a>|
|<a href="#Hardmax">Hardmax</a>|<a href="Changelog.md#Hardmax-13">13</a>, <a href="Changelog.md#Hardmax-11">11</a>, <a href="Changelog.md#Hardmax-1">1</a>|
|<a href="#Identity">Identity</a>|<a href="Changelog.md#Identity-21">21</a>, <a href="Changelog.md#Identity-19">19</a>, <a href="Changelog.md#Identity-16">16</a>, <a href="Changelog.md#Identity-14">14</a>, <a href="Changelog.md#Identity-13">13</a>, <a href="Changelog.md#Identity-1">1</a>|
|<a href="#If">If</a>|<a href="Changelog.md#If-21">21</a>, <a href="Changelog.md#If-19">19</a>, <a href="Changelog.md#If-16">16</a>, <a href="Changelog.md#If-13">13</a>, <a href="Changelog.md#If-11">11</a>, <a href="Changelog.md#If-1">1</a>|
|<a href="#ImageDecoder">ImageDecoder</a>|<a href="Changelog.md#ImageDecoder-20">20</a>|
|<a href="#InstanceNormalization">InstanceNormalization</a>|<a href="Changelog.md#InstanceNormalization-6">6</a>, <a href="Changelog.md#InstanceNormalization-1">1</a>|
|<a href="#IsInf">IsInf</a>|<a href="Changelog.md#IsInf-20">20</a>, <a href="Changelog.md#IsInf-10">10</a>|
|<a href="#IsNaN">IsNaN</a>|<a href="Changelog.md#IsNaN-20">20</a>, <a href="Changelog.md#IsNaN-13">13</a>, <a href="Changelog.md#IsNaN-9">9</a>|
|<a href="#LRN">LRN</a>|<a href="Changelog.md#LRN-13">13</a>, <a href="Changelog.md#LRN-1">1</a>|
|<a href="#LSTM">LSTM</a>|<a href="Changelog.md#LSTM-14">14</a>, <a href="Changelog.md#LSTM-7">7</a>, <a href="Changelog.md#LSTM-1">1</a>|
|<a href="#Less">Less</a>|<a href="Changelog.md#Less-13">13</a>, <a href="Changelog.md#Less-9">9</a>, <a href="Changelog.md#Less-7">7</a>, <a href="Changelog.md#Less-1">1</a>|
|<a href="#Log">Log</a>|<a href="Changelog.md#Log-13">13</a>, <a href="Changelog.md#Log-6">6</a>, <a href="Changelog.md#Log-1">1</a>|
|<a href="#Loop">Loop</a>|<a href="Changelog.md#Loop-21">21</a>, <a href="Changelog.md#Loop-19">19</a>, <a href="Changelog.md#Loop-16">16</a>, <a href="Changelog.md#Loop-13">13</a>, <a href="Changelog.md#Loop-11">11</a>, <a href="Changelog.md#Loop-1">1</a>|
|<a href="#LpNormalization">LpNormalization</a>|<a href="Changelog.md#LpNormalization-1">1</a>|
|<a href="#LpPool">LpPool</a>|<a href="Changelog.md#LpPool-18">18</a>, <a href="Changelog.md#LpPool-11">11</a>, <a href="Changelog.md#LpPool-2">2</a>, <a href="Changelog.md#LpPool-1">1</a>|
|<a href="#MatMul">MatMul</a>|<a href="Changelog.md#MatMul-13">13</a>, <a href="Changelog.md#MatMul-9">9</a>, <a href="Changelog.md#MatMul-1">1</a>|
|<a href="#MatMulInteger">MatMulInteger</a>|<a href="Changelog.md#MatMulInteger-10">10</a>|
|<a href="#Max">Max</a>|<a href="Changelog.md#Max-13">13</a>, <a href="Changelog.md#Max-12">12</a>, <a href="Changelog.md#Max-8">8</a>, <a href="Changelog.md#Max-6">6</a>, <a href="Changelog.md#Max-1">1</a>|
|<a href="#MaxPool">MaxPool</a>|<a href="Changelog.md#MaxPool-12">12</a>, <a href="Changelog.md#MaxPool-11">11</a>, <a href="Changelog.md#MaxPool-10">10</a>, <a href="Changelog.md#MaxPool-8">8</a>, <a href="Changelog.md#MaxPool-1">1</a>|
|<a href="#MaxRoiPool">MaxRoiPool</a>|<a href="Changelog.md#MaxRoiPool-1">1</a>|
|<a href="#MaxUnpool">MaxUnpool</a>|<a href="Changelog.md#MaxUnpool-11">11</a>, <a href="Changelog.md#MaxUnpool-9">9</a>|
|<a href="#Mean">Mean</a>|<a href="Changelog.md#Mean-13">13</a>, <a href="Changelog.md#Mean-8">8</a>, <a href="Changelog.md#Mean-6">6</a>, <a href="Changelog.md#Mean-1">1</a>|
|<a href="#MelWeightMatrix">MelWeightMatrix</a>|<a href="Changelog.md#MelWeightMatrix-17">17</a>|
|<a href="#Min">Min</a>|<a href="Changelog.md#Min-13">13</a>, <a href="Changelog.md#Min-12">12</a>, <a href="Changelog.md#Min-8">8</a>, <a href="Changelog.md#Min-6">6</a>, <a href="Changelog.md#Min-1">1</a>|
|<a href="#Mod">Mod</a>|<a href="Changelog.md#Mod-13">13</a>, <a href="Changelog.md#Mod-10">10</a>|
|<a href="#Mul">Mul</a>|<a href="Changelog.md#Mul-14">14</a>, <a href="Changelog.md#Mul-13">13</a>, <a href="Changelog.md#Mul-7">7</a>, <a href="Changelog.md#Mul-6">6</a>, <a href="Changelog.md#Mul-1">1</a>|
|<a href="#Multinomial">Multinomial</a>|<a href="Changelog.md#Multinomial-7">7</a>|
|<a href="#Neg">Neg</a>|<a href="Changelog.md#Neg-13">13</a>, <a href="Changelog.md#Neg-6">6</a>, <a href="Changelog.md#Neg-1">1</a>|
|<a href="#NonMaxSuppression">NonMaxSuppression</a>|<a href="Changelog.md#NonMaxSuppression-11">11</a>, <a href="Changelog.md#NonMaxSuppression-10">10</a>|
|<a href="#NonZero">NonZero</a>|<a href="Changelog.md#NonZero-13">13</a>, <a href="Changelog.md#NonZero-9">9</a>|
|<a href="#Not">Not</a>|<a href="Changelog.md#Not-1">1</a>|
|<a href="#OneHot">OneHot</a>|<a href="Changelog.md#OneHot-11">11</a>, <a href="Changelog.md#OneHot-9">9</a>|
|<a href="#Optional">Optional</a>|<a href="Changelog.md#Optional-15">15</a>|
|<a href="#OptionalGetElement">OptionalGetElement</a>|<a href="Changelog.md#OptionalGetElement-18">18</a>, <a href="Changelog.md#OptionalGetElement-15">15</a>|
|<a href="#OptionalHasElement">OptionalHasElement</a>|<a href="Changelog.md#OptionalHasElement-18">18</a>, <a href="Changelog.md#OptionalHasElement-15">15</a>|
|<a href="#Or">Or</a>|<a href="Changelog.md#Or-7">7</a>, <a href="Changelog.md#Or-1">1</a>|
|<a href="#Pad">Pad</a>|<a href="Changelog.md#Pad-21">21</a>, <a href="Changelog.md#Pad-19">19</a>, <a href="Changelog.md#Pad-18">18</a>, <a href="Changelog.md#Pad-13">13</a>, <a href="Changelog.md#Pad-11">11</a>, <a href="Changelog.md#Pad-2">2</a>, <a href="Changelog.md#Pad-1">1</a>|
|<a href="#Pow">Pow</a>|<a href="Changelog.md#Pow-15">15</a>, <a href="Changelog.md#Pow-13">13</a>, <a href="Changelog.md#Pow-12">12</a>, <a href="Changelog.md#Pow-7">7</a>, <a href="Changelog.md#Pow-1">1</a>|
|<a href="#QLinearConv">QLinearConv</a>|<a href="Changelog.md#QLinearConv-10">10</a>|
|<a href="#QLinearMatMul">QLinearMatMul</a>|<a href="Changelog.md#QLinearMatMul-21">21</a>, <a href="Changelog.md#QLinearMatMul-10">10</a>|
|<a href="#QuantizeLinear">QuantizeLinear</a>|<a href="Changelog.md#QuantizeLinear-21">21</a>, <a href="Changelog.md#QuantizeLinear-19">19</a>, <a href="Changelog.md#QuantizeLinear-13">13</a>, <a href="Changelog.md#QuantizeLinear-10">10</a>|
|<a href="#RNN">RNN</a>|<a href="Changelog.md#RNN-14">14</a>, <a href="Changelog.md#RNN-7">7</a>, <a href="Changelog.md#RNN-1">1</a>|
|<a href="#RandomNormal">RandomNormal</a>|<a href="Changelog.md#RandomNormal-1">1</a>|
|<a href="#RandomNormalLike">RandomNormalLike</a>|<a href="Changelog.md#RandomNormalLike-1">1</a>|
|<a href="#RandomUniform">RandomUniform</a>|<a href="Changelog.md#RandomUniform-1">1</a>|
|<a href="#RandomUniformLike">RandomUniformLike</a>|<a href="Changelog.md#RandomUniformLike-1">1</a>|
|<a href="#Reciprocal">Reciprocal</a>|<a href="Changelog.md#Reciprocal-13">13</a>, <a href="Changelog.md#Reciprocal-6">6</a>, <a href="Changelog.md#Reciprocal-1">1</a>|
|<a href="#ReduceMax">ReduceMax</a>|<a href="Changelog.md#ReduceMax-20">20</a>, <a href="Changelog.md#ReduceMax-18">18</a>, <a href="Changelog.md#ReduceMax-13">13</a>, <a href="Changelog.md#ReduceMax-12">12</a>, <a href="Changelog.md#ReduceMax-11">11</a>, <a href="Changelog.md#ReduceMax-1">1</a>|
|<a href="#ReduceMean">ReduceMean</a>|<a href="Changelog.md#ReduceMean-18">18</a>, <a href="Changelog.md#ReduceMean-13">13</a>, <a href="Changelog.md#ReduceMean-11">11</a>, <a href="Changelog.md#ReduceMean-1">1</a>|
|<a href="#ReduceMin">ReduceMin</a>|<a href="Changelog.md#ReduceMin-20">20</a>, <a href="Changelog.md#ReduceMin-18">18</a>, <a href="Changelog.md#ReduceMin-13">13</a>, <a href="Changelog.md#ReduceMin-12">12</a>, <a href="Changelog.md#ReduceMin-11">11</a>, <a href="Changelog.md#ReduceMin-1">1</a>|
|<a href="#ReduceProd">ReduceProd</a>|<a href="Changelog.md#ReduceProd-18">18</a>, <a href="Changelog.md#ReduceProd-13">13</a>, <a href="Changelog.md#ReduceProd-11">11</a>, <a href="Changelog.md#ReduceProd-1">1</a>|
|<a href="#ReduceSum">ReduceSum</a>|<a href="Changelog.md#ReduceSum-13">13</a>, <a href="Changelog.md#ReduceSum-11">11</a>, <a href="Changelog.md#ReduceSum-1">1</a>|
|<a href="#RegexFullMatch">RegexFullMatch</a>|<a href="Changelog.md#RegexFullMatch-20">20</a>|
|<a href="#Reshape">Reshape</a>|<a href="Changelog.md#Reshape-21">21</a>, <a href="Changelog.md#Reshape-19">19</a>, <a href="Changelog.md#Reshape-14">14</a>, <a href="Changelog.md#Reshape-13">13</a>, <a href="Changelog.md#Reshape-5">5</a>, <a href="Changelog.md#Reshape-1">1</a>|
|<a href="#Resize">Resize</a>|<a href="Changelog.md#Resize-19">19</a>, <a href="Changelog.md#Resize-18">18</a>, <a href="Changelog.md#Resize-13">13</a>, <a href="Changelog.md#Resize-11">11</a>, <a href="Changelog.md#Resize-10">10</a>|
|<a href="#ReverseSequence">ReverseSequence</a>|<a href="Changelog.md#ReverseSequence-10">10</a>|
|<a href="#RoiAlign">RoiAlign</a>|<a href="Changelog.md#RoiAlign-16">16</a>, <a href="Changelog.md#RoiAlign-10">10</a>|
|<a href="#Round">Round</a>|<a href="Changelog.md#Round-11">11</a>|
|<a href="#STFT">STFT</a>|<a href="Changelog.md#STFT-17">17</a>|
|<a href="#Scan">Scan</a>|<a href="Changelog.md#Scan-21">21</a>, <a href="Changelog.md#Scan-19">19</a>, <a href="Changelog.md#Scan-16">16</a>, <a href="Changelog.md#Scan-11">11</a>, <a href="Changelog.md#Scan-9">9</a>, <a href="Changelog.md#Scan-8">8</a>|
|<a href="#Scatter">Scatter</a> (deprecated)|<a href="Changelog.md#Scatter-11">11</a>, <a href="Changelog.md#Scatter-9">9</a>|
|<a href="#ScatterElements">ScatterElements</a>|<a href="Changelog.md#ScatterElements-18">18</a>, <a href="Changelog.md#ScatterElements-16">16</a>, <a href="Changelog.md#ScatterElements-13">13</a>, <a href="Changelog.md#ScatterElements-11">11</a>|
|<a href="#ScatterND">ScatterND</a>|<a href="Changelog.md#ScatterND-18">18</a>, <a href="Changelog.md#ScatterND-16">16</a>, <a href="Changelog.md#ScatterND-13">13</a>, <a href="Changelog.md#ScatterND-11">11</a>|
|<a href="#SequenceAt">SequenceAt</a>|<a href="Changelog.md#SequenceAt-11">11</a>|
|<a href="#SequenceConstruct">SequenceConstruct</a>|<a href="Changelog.md#SequenceConstruct-11">11</a>|
|<a href="#SequenceEmpty">SequenceEmpty</a>|<a href="Changelog.md#SequenceEmpty-11">11</a>|
|<a href="#SequenceErase">SequenceErase</a>|<a href="Changelog.md#SequenceErase-11">11</a>|
|<a href="#SequenceInsert">SequenceInsert</a>|<a href="Changelog.md#SequenceInsert-11">11</a>|
|<a href="#SequenceLength">SequenceLength</a>|<a href="Changelog.md#SequenceLength-11">11</a>|
|<a href="#Shape">Shape</a>|<a href="Changelog.md#Shape-21">21</a>, <a href="Changelog.md#Shape-19">19</a>, <a href="Changelog.md#Shape-15">15</a>, <a href="Changelog.md#Shape-13">13</a>, <a href="Changelog.md#Shape-1">1</a>|
|<a href="#Sigmoid">Sigmoid</a>|<a href="Changelog.md#Sigmoid-13">13</a>, <a href="Changelog.md#Sigmoid-6">6</a>, <a href="Changelog.md#Sigmoid-1">1</a>|
|<a href="#Sign">Sign</a>|<a href="Changelog.md#Sign-13">13</a>, <a href="Changelog.md#Sign-9">9</a>|
|<a href="#Sin">Sin</a>|<a href="Changelog.md#Sin-7">7</a>|
|<a href="#Sinh">Sinh</a>|<a href="Changelog.md#Sinh-9">9</a>|
|<a href="#Size">Size</a>|<a href="Changelog.md#Size-21">21</a>, <a href="Changelog.md#Size-19">19</a>, <a href="Changelog.md#Size-13">13</a>, <a href="Changelog.md#Size-1">1</a>|
|<a href="#Slice">Slice</a>|<a href="Changelog.md#Slice-13">13</a>, <a href="Changelog.md#Slice-11">11</a>, <a href="Changelog.md#Slice-10">10</a>, <a href="Changelog.md#Slice-1">1</a>|
|<a href="#SpaceToDepth">SpaceToDepth</a>|<a href="Changelog.md#SpaceToDepth-13">13</a>, <a href="Changelog.md#SpaceToDepth-1">1</a>|
|<a href="#Split">Split</a>|<a href="Changelog.md#Split-18">18</a>, <a href="Changelog.md#Split-13">13</a>, <a href="Changelog.md#Split-11">11</a>, <a href="Changelog.md#Split-2">2</a>, <a href="Changelog.md#Split-1">1</a>|
|<a href="#SplitToSequence">SplitToSequence</a>|<a href="Changelog.md#SplitToSequence-11">11</a>|
|<a href="#Sqrt">Sqrt</a>|<a href="Changelog.md#Sqrt-13">13</a>, <a href="Changelog.md#Sqrt-6">6</a>, <a href="Changelog.md#Sqrt-1">1</a>|
|<a href="#Squeeze">Squeeze</a>|<a href="Changelog.md#Squeeze-21">21</a>, <a href="Changelog.md#Squeeze-13">13</a>, <a href="Changelog.md#Squeeze-11">11</a>, <a href="Changelog.md#Squeeze-1">1</a>|
|<a href="#StringConcat">StringConcat</a>|<a href="Changelog.md#StringConcat-20">20</a>|
|<a href="#StringNormalizer">StringNormalizer</a>|<a href="Changelog.md#StringNormalizer-10">10</a>|
|<a href="#StringSplit">StringSplit</a>|<a href="Changelog.md#StringSplit-20">20</a>|
|<a href="#Sub">Sub</a>|<a href="Changelog.md#Sub-14">14</a>, <a href="Changelog.md#Sub-13">13</a>, <a href="Changelog.md#Sub-7">7</a>, <a href="Changelog.md#Sub-6">6</a>, <a href="Changelog.md#Sub-1">1</a>|
|<a href="#Sum">Sum</a>|<a href="Changelog.md#Sum-13">13</a>, <a href="Changelog.md#Sum-8">8</a>, <a href="Changelog.md#Sum-6">6</a>, <a href="Changelog.md#Sum-1">1</a>|
|<a href="#Tan">Tan</a>|<a href="Changelog.md#Tan-7">7</a>|
|<a href="#Tanh">Tanh</a>|<a href="Changelog.md#Tanh-13">13</a>, <a href="Changelog.md#Tanh-6">6</a>, <a href="Changelog.md#Tanh-1">1</a>|
|<a href="#TfIdfVectorizer">TfIdfVectorizer</a>|<a href="Changelog.md#TfIdfVectorizer-9">9</a>|
|<a href="#Tile">Tile</a>|<a href="Changelog.md#Tile-13">13</a>, <a href="Changelog.md#Tile-6">6</a>, <a href="Changelog.md#Tile-1">1</a>|
|<a href="#TopK">TopK</a>|<a href="Changelog.md#TopK-11">11</a>, <a href="Changelog.md#TopK-10">10</a>, <a href="Changelog.md#TopK-1">1</a>|
|<a href="#Transpose">Transpose</a>|<a href="Changelog.md#Transpose-21">21</a>, <a href="Changelog.md#Transpose-13">13</a>, <a href="Changelog.md#Transpose-1">1</a>|
|<a href="#Trilu">Trilu</a>|<a href="Changelog.md#Trilu-14">14</a>|
|<a href="#Unique">Unique</a>|<a href="Changelog.md#Unique-11">11</a>|
|<a href="#Unsqueeze">Unsqueeze</a>|<a href="Changelog.md#Unsqueeze-21">21</a>, <a href="Changelog.md#Unsqueeze-13">13</a>, <a href="Changelog.md#Unsqueeze-11">11</a>, <a href="Changelog.md#Unsqueeze-1">1</a>|
|<a href="#Upsample">Upsample</a> (deprecated)|<a href="Changelog.md#Upsample-10">10</a>, <a href="Changelog.md#Upsample-9">9</a>, <a href="Changelog.md#Upsample-7">7</a>|
|<a href="#Where">Where</a>|<a href="Changelog.md#Where-16">16</a>, <a href="Changelog.md#Where-9">9</a>|
|<a href="#Xor">Xor</a>|<a href="Changelog.md#Xor-7">7</a>, <a href="Changelog.md#Xor-1">1</a>|
|**Function**|**Since version**|**Function version**|
|<a href="#AffineGrid">AffineGrid</a>|<a href="Changelog.md#AffineGrid-20">20</a>|20|
|<a href="#Bernoulli">Bernoulli</a>|<a href="Changelog.md#Bernoulli-15">15</a>|15|
|<a href="#BlackmanWindow">BlackmanWindow</a>|<a href="Changelog.md#BlackmanWindow-17">17</a>|17|
|<a href="#CastLike">CastLike</a>|<a href="Changelog.md#CastLike-21">21</a>, <a href="Changelog.md#CastLike-19">19</a>, <a href="Changelog.md#CastLike-15">15</a>|21|
|<a href="#Celu">Celu</a>|<a href="Changelog.md#Celu-12">12</a>|12|
|<a href="#CenterCropPad">CenterCropPad</a>|<a href="Changelog.md#CenterCropPad-18">18</a>|18|
|<a href="#Clip">Clip</a>|<a href="Changelog.md#Clip-13">13</a>, <a href="Changelog.md#Clip-12">12</a>, <a href="Changelog.md#Clip-11">11</a>, <a href="Changelog.md#Clip-6">6</a>, <a href="Changelog.md#Clip-1">1</a>|13|
|<a href="#DynamicQuantizeLinear">DynamicQuantizeLinear</a>|<a href="Changelog.md#DynamicQuantizeLinear-11">11</a>|11|
|<a href="#Elu">Elu</a>|<a href="Changelog.md#Elu-6">6</a>, <a href="Changelog.md#Elu-1">1</a>|18|
|<a href="#Gelu">Gelu</a>|<a href="Changelog.md#Gelu-20">20</a>|20|
|<a href="#GreaterOrEqual">GreaterOrEqual</a>|<a href="Changelog.md#GreaterOrEqual-16">16</a>, <a href="Changelog.md#GreaterOrEqual-12">12</a>|16|
|<a href="#GroupNormalization">GroupNormalization</a>|<a href="Changelog.md#GroupNormalization-21">21</a>, <a href="Changelog.md#GroupNormalization-18">18</a>|21|
|<a href="#HammingWindow">HammingWindow</a>|<a href="Changelog.md#HammingWindow-17">17</a>|17|
|<a href="#HannWindow">HannWindow</a>|<a href="Changelog.md#HannWindow-17">17</a>|17|
|<a href="#HardSigmoid">HardSigmoid</a>|<a href="Changelog.md#HardSigmoid-6">6</a>, <a href="Changelog.md#HardSigmoid-1">1</a>|18|
|<a href="#HardSwish">HardSwish</a>|<a href="Changelog.md#HardSwish-14">14</a>|14|
|<a href="#LayerNormalization">LayerNormalization</a>|<a href="Changelog.md#LayerNormalization-17">17</a>|17, 18|
|<a href="#LeakyRelu">LeakyRelu</a>|<a href="Changelog.md#LeakyRelu-16">16</a>, <a href="Changelog.md#LeakyRelu-6">6</a>, <a href="Changelog.md#LeakyRelu-1">1</a>|16|
|<a href="#LessOrEqual">LessOrEqual</a>|<a href="Changelog.md#LessOrEqual-16">16</a>, <a href="Changelog.md#LessOrEqual-12">12</a>|16|
|<a href="#LogSoftmax">LogSoftmax</a>|<a href="Changelog.md#LogSoftmax-13">13</a>, <a href="Changelog.md#LogSoftmax-11">11</a>, <a href="Changelog.md#LogSoftmax-1">1</a>|13, 18|
|<a href="#MeanVarianceNormalization">MeanVarianceNormalization</a>|<a href="Changelog.md#MeanVarianceNormalization-13">13</a>, <a href="Changelog.md#MeanVarianceNormalization-9">9</a>|13, 18|
|<a href="#Mish">Mish</a>|<a href="Changelog.md#Mish-18">18</a>|18|
|<a href="#NegativeLogLikelihoodLoss">NegativeLogLikelihoodLoss</a>|<a href="Changelog.md#NegativeLogLikelihoodLoss-13">13</a>, <a href="Changelog.md#NegativeLogLikelihoodLoss-12">12</a>|13|
|<a href="#PRelu">PRelu</a>|<a href="Changelog.md#PRelu-16">16</a>, <a href="Changelog.md#PRelu-9">9</a>, <a href="Changelog.md#PRelu-7">7</a>, <a href="Changelog.md#PRelu-6">6</a>, <a href="Changelog.md#PRelu-1">1</a>|16|
|<a href="#Range">Range</a>|<a href="Changelog.md#Range-11">11</a>|11|
|<a href="#ReduceL1">ReduceL1</a>|<a href="Changelog.md#ReduceL1-18">18</a>, <a href="Changelog.md#ReduceL1-13">13</a>, <a href="Changelog.md#ReduceL1-11">11</a>, <a href="Changelog.md#ReduceL1-1">1</a>|18|
|<a href="#ReduceL2">ReduceL2</a>|<a href="Changelog.md#ReduceL2-18">18</a>, <a href="Changelog.md#ReduceL2-13">13</a>, <a href="Changelog.md#ReduceL2-11">11</a>, <a href="Changelog.md#ReduceL2-1">1</a>|18|
|<a href="#ReduceLogSum">ReduceLogSum</a>|<a href="Changelog.md#ReduceLogSum-18">18</a>, <a href="Changelog.md#ReduceLogSum-13">13</a>, <a href="Changelog.md#ReduceLogSum-11">11</a>, <a href="Changelog.md#ReduceLogSum-1">1</a>|18|
|<a href="#ReduceLogSumExp">ReduceLogSumExp</a>|<a href="Changelog.md#ReduceLogSumExp-18">18</a>, <a href="Changelog.md#ReduceLogSumExp-13">13</a>, <a href="Changelog.md#ReduceLogSumExp-11">11</a>, <a href="Changelog.md#ReduceLogSumExp-1">1</a>|18|
|<a href="#ReduceSumSquare">ReduceSumSquare</a>|<a href="Changelog.md#ReduceSumSquare-18">18</a>, <a href="Changelog.md#ReduceSumSquare-13">13</a>, <a href="Changelog.md#ReduceSumSquare-11">11</a>, <a href="Changelog.md#ReduceSumSquare-1">1</a>|18|
|<a href="#Relu">Relu</a>|<a href="Changelog.md#Relu-14">14</a>, <a href="Changelog.md#Relu-13">13</a>, <a href="Changelog.md#Relu-6">6</a>, <a href="Changelog.md#Relu-1">1</a>|18|
|<a href="#Selu">Selu</a>|<a href="Changelog.md#Selu-6">6</a>, <a href="Changelog.md#Selu-1">1</a>|18|
|<a href="#SequenceMap">SequenceMap</a>|<a href="Changelog.md#SequenceMap-17">17</a>|17|
|<a href="#Shrink">Shrink</a>|<a href="Changelog.md#Shrink-9">9</a>|18|
|<a href="#Softmax">Softmax</a>|<a href="Changelog.md#Softmax-13">13</a>, <a href="Changelog.md#Softmax-11">11</a>, <a href="Changelog.md#Softmax-1">1</a>|13, 18|
|<a href="#SoftmaxCrossEntropyLoss">SoftmaxCrossEntropyLoss</a>|<a href="Changelog.md#SoftmaxCrossEntropyLoss-13">13</a>, <a href="Changelog.md#SoftmaxCrossEntropyLoss-12">12</a>|13|
|<a href="#Softplus">Softplus</a>|<a href="Changelog.md#Softplus-1">1</a>|18|
|<a href="#Softsign">Softsign</a>|<a href="Changelog.md#Softsign-1">1</a>|18|
|<a href="#ThresholdedRelu">ThresholdedRelu</a>|<a href="Changelog.md#ThresholdedRelu-10">10</a>|18|

### ai.onnx.preview.training
|**Operator**|**Since version**||
|-|-|-|
|<a href="#ai.onnx.preview.training.Adagrad">ai.onnx.preview.training.Adagrad</a>|<a href="Changelog.md#ai.onnx.preview.training.Adagrad-1">1</a>|
|<a href="#ai.onnx.preview.training.Adam">ai.onnx.preview.training.Adam</a>|<a href="Changelog.md#ai.onnx.preview.training.Adam-1">1</a>|
|<a href="#ai.onnx.preview.training.Gradient">ai.onnx.preview.training.Gradient</a>|<a href="Changelog.md#ai.onnx.preview.training.Gradient-1">1</a>|
|<a href="#ai.onnx.preview.training.Momentum">ai.onnx.preview.training.Momentum</a>|<a href="Changelog.md#ai.onnx.preview.training.Momentum-1">1</a>|


## ai.onnx (default)
### <a name="Abs"></a><a name="abs">**Abs**</a>

  Abs takes one input tensor (Tensor<T>) and produces one output tensor
  (Tensor<T>), where the absolute value, y = abs(x), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Abs-1">1</a>, <a href="Changelog.md#Abs-6">6</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>abs</summary>

```python
node = onnx.helper.make_node(
    "Abs",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = abs(x)

expect(node, inputs=[x], outputs=[y], name="test_abs")
```

</details>


#### Sample Implementation

<details>
<summary>Abs</summary>

```python
# SPDX-License-Identifier: Apache-2.0

import numpy as np


def abs(input: np.ndarray) -> np.ndarray:  # noqa: A001
    return np.abs(input)  # type: ignore[no-any-return]

```

</details>


### <a name="Acos"></a><a name="acos">**Acos**</a>

  Calculates the arccosine (inverse of cosine) of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The arccosine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>acos</summary>

```python
node = onnx.helper.make_node(
    "Acos",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-0.5, 0, 0.5]).astype(np.float32)
y = np.arccos(x)
expect(node, inputs=[x], outputs=[y], name="test_acos_example")

x = np.random.rand(3, 4, 5).astype(np.float32)
y = np.arccos(x)
expect(node, inputs=[x], outputs=[y], name="test_acos")
```

</details>
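
This entry lists no sample implementation. As a rough reference, a minimal NumPy sketch of the element-wise arccosine could look like the following. It assumes that `np.arccos` (the function used in the example above) matches the operator's intent; the helper name `acos_reference` is hypothetical and not part of ONNX.

```python
import numpy as np


def acos_reference(x: np.ndarray) -> np.ndarray:
    """Hypothetical helper: element-wise arccosine via NumPy."""
    # np.arccos is applied independently to every element of x.
    return np.arccos(x)


# Inputs are kept inside [-1, 1], where the arccosine is real-valued.
x = np.array([-1.0, 0.0, 1.0], dtype=np.float32)
y = acos_reference(x)
print(y, y.dtype)
```

For floating-point inputs NumPy keeps the input precision, which lines up with the single type variable T shared by `input` and `output`.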


### <a name="Acosh"></a><a name="acosh">**Acosh**</a>

  Calculates the hyperbolic arccosine of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The hyperbolic arccosine values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>acosh</summary>

```python
node = onnx.helper.make_node(
    "Acosh",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([10, np.e, 1]).astype(np.float32)
y = np.arccosh(x)  # expected output [2.99322295,  1.65745449,  0.]
expect(node, inputs=[x], outputs=[y], name="test_acosh_example")

x = np.random.uniform(1.0, 10.0, (3, 4, 5)).astype(np.float32)
y = np.arccosh(x)
expect(node, inputs=[x], outputs=[y], name="test_acosh")
```

</details>


### <a name="Add"></a><a name="add">**Add**</a>

  Performs element-wise binary addition (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

  (Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
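
The broadcasting behavior described above can be previewed with plain NumPy, which the description names as the model. This is a minimal sketch of the shape rules only; the array names are illustrative and no ONNX API is involved.

```python
import numpy as np

# A trailing dimension of 5 lines up with the last axis of (3, 4, 5).
x = np.zeros((3, 4, 5), dtype=np.float32)
y = np.ones((5,), dtype=np.float32)
total = x + y

# Multidirectional: both operands are stretched along their size-1 axes.
a = np.zeros((1, 4, 1, 6), dtype=np.float32)
b = np.ones((3, 1, 5, 6), dtype=np.float32)
both = a + b

# Shapes that disagree on a non-1 axis cannot be broadcast.
try:
    x + np.ones((4,), dtype=np.float32)
    compatible = True
except ValueError:
    compatible = False
```

Here `total` has shape `(3, 4, 5)` and `both` has shape `(3, 4, 5, 6)`, while the last pair is rejected because the trailing axes are 5 and 4.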

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Add-1">1</a>, <a href="Changelog.md#Add-6">6</a>, <a href="Changelog.md#Add-7">7</a>, <a href="Changelog.md#Add-13">13</a>

#### Inputs

<dl>
<dt><tt>A</tt> (differentiable) : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> (differentiable) : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (differentiable) : T</dt>
<dd>Result, has same element type as two inputs</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>add</summary>

```python
node = onnx.helper.make_node(
    "Add",
    inputs=["x", "y"],
    outputs=["sum"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add")
```

</details>


<details>
<summary>add_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Add",
    inputs=["x", "y"],
    outputs=["sum"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_bcast")
```

</details>


<details>
<summary>add_uint8</summary>

```python
node = onnx.helper.make_node(
    "Add",
    inputs=["x", "y"],
    outputs=["sum"],
)

x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
expect(node, inputs=[x, y], outputs=[x + y], name="test_add_uint8")
```

</details>


### <a name="AffineGrid"></a><a name="affinegrid">**AffineGrid**</a>

  Generates a 2D or 3D flow field (sampling grid), given a batch of affine matrices theta
  (https://pytorch.org/docs/stable/generated/torch.nn.functional.affine_grid.html).
  An affine matrix `theta` is applied to a position tensor represented in its homogeneous expression. Here is an example in 3D:
  ```
  [r00, r01, r02, t0]   [x]   [x']
  [r10, r11, r12, t1] * [y] = [y']
  [r20, r21, r22, t2]   [z]   [z']
  [0,   0,   0,   1 ]   [1]   [1 ]
  ```
  where `(x, y, z)` is the position in the original space, `(x', y', z')` is the position in the output space.
  The last row is always `[0, 0, 0, 1]` and is not stored in the affine matrix. Therefore we have `theta` of shape `(N, 2, 3)` for 2D or `(N, 3, 4)` for 3D.

  Input `size` is used to define a grid of positions evenly spaced in the original 2D or 3D space, with coordinates ranging from `-1` to `1`.
  The output `grid` contains positions in the output space.

  When `align_corners=1`, consider `-1` and `1` to refer to the centers of the corner pixels (mark `v` in illustration).
  ```
  v            v            v            v
  |-------------------|------------------|
  -1                  0                  1
  ```
  When `align_corners=0`, consider `-1` and `1` to refer to the outer edge of the corner pixels.
  ```
      v        v         v         v
  |------------------|-------------------|
  -1                 0                   1
  ```
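
The 2D case described above can be sketched in NumPy as follows. This is an illustrative reading of the text, not the reference implementation: the function name `affine_grid_2d` and its helper `coords` are hypothetical, and the pixel-center formula used for `align_corners=0` is an assumption based on the illustration (it matches the PyTorch function linked above).

```python
import numpy as np


def affine_grid_2d(theta, size, align_corners=0):
    """Sketch: theta has shape (N, 2, 3), size is (N, C, H, W)."""
    _, _, H, W = size

    def coords(n):
        if align_corners:
            # -1 and 1 are the centers of the corner pixels.
            return np.linspace(-1.0, 1.0, n)
        # -1 and 1 are the outer edges, so pixel centers sit half a step inside.
        return (2.0 * np.arange(n) + 1.0) / n - 1.0

    xs, ys = np.meshgrid(coords(W), coords(H))  # each of shape (H, W)
    # Homogeneous positions [x, y, 1] in the original space.
    base = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3)
    # grid[n, h, w, :] = theta[n] @ base[h, w, :]
    return np.einsum("nij,hwj->nhwi", theta, base)


identity = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
grid_corners = affine_grid_2d(identity, (1, 1, 2, 3), align_corners=1)
grid_edges = affine_grid_2d(identity, (1, 1, 2, 3), align_corners=0)
```

With the identity matrix the output grid equals the original grid, which makes the difference between the two `align_corners` settings easy to inspect.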

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>align_corners</tt> : int (default is 0)</dt>
<dd>If align_corners=1, consider -1 and 1 to refer to the centers of the corner pixels. If align_corners=0, consider -1 and 1 to refer to the outer edge of the corner pixels.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>theta</tt> (non-differentiable) : T1</dt>
<dd>input batch of affine matrices with shape (N, 2, 3) for 2D or (N, 3, 4) for 3D</dd>
<dt><tt>size</tt> (non-differentiable) : T2</dt>
<dd>the target output image size (N, C, H, W) for 2D or (N, C, D, H, W) for 3D</dd>
</dl>

#### Outputs

<dl>
<dt><tt>grid</tt> (differentiable) : T1</dt>
<dd>output tensor of shape (N, H, W, 2) of 2D sample coordinates or (N, D, H, W, 3) of 3D sample coordinates.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(bfloat16), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain grid types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(int64)</dt>
<dd>Constrain size's type to int64 tensors.</dd>
</dl>


#### Examples

<details>
<summary>2d_no_reference_evaluator</summary>

```python
theta_2d = create_theta_2d()
N, C, H, W = len(theta_2d), 3, 5, 6
data_size = (H, W)
for align_corners in (0, 1):
    node = onnx.helper.make_node(
        "AffineGrid",
        inputs=["theta", "size"],
        outputs=["grid"],
        align_corners=align_corners,
    )

    original_grid = construct_original_grid(data_size, align_corners)
    grid = apply_affine_transform(theta_2d, original_grid)

    test_name = "test_affine_grid_2d"
    if align_corners == 1:
        test_name += "_align_corners"
    expect(
        node,
        inputs=[theta_2d, np.array([N, C, H, W], dtype=np.int64)],
        outputs=[grid],
        name=test_name,
    )
```

</details>


<details>
<summary>3d_no_reference_evaluator</summary>

```python
theta_3d = create_theta_3d()
N, C, D, H, W = len(theta_3d), 3, 4, 5, 6
data_size = (D, H, W)
for align_corners in (0, 1):
    node = onnx.helper.make_node(
        "AffineGrid",
        inputs=["theta", "size"],
        outputs=["grid"],
        align_corners=align_corners,
    )

    original_grid = construct_original_grid(data_size, align_corners)
    grid = apply_affine_transform(theta_3d, original_grid)

    test_name = "test_affine_grid_3d"
    if align_corners == 1:
        test_name += "_align_corners"
    expect(
        node,
        inputs=[theta_3d, np.array([N, C, D, H, W], dtype=np.int64)],
        outputs=[grid],
        name=test_name,
    )
```

</details>


### <a name="And"></a><a name="and">**And**</a>

  Returns the tensor resulting from performing the `and` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#And-1">1</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrain input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>


#### Examples

<details>
<summary>and</summary>

```python
node = onnx.helper.make_node(
    "And",
    inputs=["x", "y"],
    outputs=["and"],
)

# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
y = (np.random.randn(3, 4) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and2d")

# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(3, 4, 5) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and3d")

# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and4d")
```

</details>


<details>
<summary>and_broadcast</summary>

```python
node = onnx.helper.make_node(
    "And",
    inputs=["x", "y"],
    outputs=["and"],
)

# 3d vs 1d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(5) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast3v1d")

# 3d vs 2d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(4, 5) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast3v2d")

# 4d vs 2d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast4v2d")

# 4d vs 3d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(4, 5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast4v3d")

# 4d vs 4d
x = (np.random.randn(1, 4, 1, 6) > 0).astype(bool)
y = (np.random.randn(3, 1, 5, 6) > 0).astype(bool)
z = np.logical_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_and_bcast4v4d")
```

</details>


### <a name="ArgMax"></a><a name="argmax">**ArgMax**</a>

  Computes the indices of the max elements of the input tensor along the
  provided axis. The resulting tensor has the same rank as the input if keepdims equals 1.
  If keepdims equals 0, then the resulting tensor has the reduced dimension pruned.
  If select_last_index is True (default False), the index of the last occurrence of the max
  is selected if the max appears more than once in the input. Otherwise the index of the
  first occurrence is selected.
  The type of the output tensor is integer.
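
The interaction of `axis`, `keepdims`, and `select_last_index` can be sketched with NumPy as below. The function name `argmax_sketch` is hypothetical and is not the helper used by the examples in this section; it is one possible way to obtain the behavior described above.

```python
import numpy as np


def argmax_sketch(data, axis=0, keepdims=1, select_last_index=0):
    if select_last_index:
        # Search the reversed axis, then map the hit back to the original index.
        flipped = np.flip(data, axis)
        idx = data.shape[axis] - 1 - np.argmax(flipped, axis=axis)
    else:
        # np.argmax returns the first occurrence of the maximum.
        idx = np.argmax(data, axis=axis)
    if keepdims:
        # Put the reduced axis back with length 1.
        idx = np.expand_dims(idx, axis)
    return idx.astype(np.int64)


data = np.array([[2, 2], [3, 10]], dtype=np.float32)
first = argmax_sketch(data, axis=1, keepdims=0)
last = argmax_sketch(data, axis=1, keepdims=0, select_last_index=1)
kept = argmax_sketch(data, axis=-1, keepdims=1, select_last_index=1)
```

The first row of `data` contains a tie, so `first` and `last` differ only in that row.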

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ArgMax-1">1</a>, <a href="Changelog.md#ArgMax-11">11</a>, <a href="Changelog.md#ArgMax-12">12</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 0)</dt>
<dd>The axis in which to compute the arg indices. Accepted range is [-r, r-1] where r = rank(data).</dd>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>select_last_index</tt> : int (default is 0)</dt>
<dd>Whether to select the last index or the first index if the max appears in multiple indices, default is False (first index).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (non-differentiable) : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (non-differentiable) : tensor(int64)</dt>
<dd>Reduced output tensor with integer data type.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input type to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
    "ArgMax", inputs=["data"], outputs=["result"], keepdims=keepdims
)

# result: [[1, 1]]
result = argmax_use_numpy(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_default_axis_example",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmax_use_numpy(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_default_axis_random",
)
```

</details>


<details>
<summary>default_axes_keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
    "ArgMax",
    inputs=["data"],
    outputs=["result"],
    keepdims=keepdims,
    select_last_index=True,
)

# result: [[1, 1]]
result = argmax_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_default_axis_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmax_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_default_axis_random_select_last_index",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMax", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [[0], [1]]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node, inputs=[data], outputs=[result], name="test_argmax_keepdims_example"
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node, inputs=[data], outputs=[result], name="test_argmax_keepdims_random"
)
```

</details>


<details>
<summary>keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMax",
    inputs=["data"],
    outputs=["result"],
    axis=axis,
    keepdims=keepdims,
    select_last_index=True,
)
# result: [[1], [1]]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_keepdims_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_keepdims_random_select_last_index",
)
```

</details>


<details>
<summary>negative_axis_keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMax", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [[0], [1]]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_negative_axis_keepdims_example",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_negative_axis_keepdims_random",
)
```

</details>


<details>
<summary>negative_axis_keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMax",
    inputs=["data"],
    outputs=["result"],
    axis=axis,
    keepdims=keepdims,
    select_last_index=True,
)
# result: [[1], [1]]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_negative_axis_keepdims_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_negative_axis_keepdims_random_select_last_index",
)
```

</details>


<details>
<summary>no_keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
    "ArgMax", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [0, 1]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_no_keepdims_example",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmax_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node, inputs=[data], outputs=[result], name="test_argmax_no_keepdims_random"
)
```

</details>


<details>
<summary>no_keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
    "ArgMax",
    inputs=["data"],
    outputs=["result"],
    axis=axis,
    keepdims=keepdims,
    select_last_index=True,
)
# result: [1, 1]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_no_keepdims_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmax_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmax_no_keepdims_random_select_last_index",
)
```

</details>


### <a name="ArgMin"></a><a name="argmin">**ArgMin**</a>

  Computes the indices of the min elements of the input tensor along the
  provided axis. The resulting tensor has the same rank as the input if keepdims equals 1.
  If keepdims equals 0, then the resulting tensor has the reduced dimension pruned.
  If select_last_index is True (default False), the index of the last occurrence of the min
  is selected if the min appears more than once in the input. Otherwise the index of the
  first occurrence is selected.
  The type of the output tensor is integer.
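
One way to see how ties are resolved when `select_last_index` is set is to mark every position that holds the minimum and keep the largest marked index. The sketch below does this with NumPy; the name `argmin_last_index` is hypothetical, and NaN inputs are not handled.

```python
import numpy as np


def argmin_last_index(data, axis=0, keepdims=1):
    axis = axis % data.ndim  # normalize a negative axis
    # True wherever the element equals the minimum along the axis.
    is_min = data == data.min(axis=axis, keepdims=True)
    # Index values 0..n-1 laid out along the reduced axis.
    shape = [1] * data.ndim
    shape[axis] = data.shape[axis]
    positions = np.arange(data.shape[axis]).reshape(shape)
    # Keep the largest index that holds a minimum.
    last = np.where(is_min, positions, -1).max(axis=axis, keepdims=bool(keepdims))
    return last.astype(np.int64)


data = np.array([[2, 2], [3, 10]], dtype=np.float32)
along_rows = argmin_last_index(data, axis=1, keepdims=1)
along_cols = argmin_last_index(data, axis=0, keepdims=1)
pruned = argmin_last_index(data, axis=1, keepdims=0)
```

Note how `keepdims=1` yields shapes `(2, 1)` and `(1, 2)` for the two axes, while `keepdims=0` yields a 1-D result of shape `(2,)`.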

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ArgMin-1">1</a>, <a href="Changelog.md#ArgMin-11">11</a>, <a href="Changelog.md#ArgMin-12">12</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 0)</dt>
<dd>The axis in which to compute the arg indices. Accepted range is [-r, r-1] where r = rank(data).</dd>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>select_last_index</tt> : int (default is 0)</dt>
<dd>Whether to select the last index or the first index if the min appears in multiple indices, default is False (first index).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (non-differentiable) : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (non-differentiable) : tensor(int64)</dt>
<dd>Reduced output tensor with integer data type.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input type to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
    "ArgMin", inputs=["data"], outputs=["result"], keepdims=keepdims
)

# result: [[0, 0]]
result = argmin_use_numpy(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_default_axis_example",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmin_use_numpy(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_default_axis_random",
)
```

</details>


<details>
<summary>default_axes_keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
keepdims = 1
node = onnx.helper.make_node(
    "ArgMin",
    inputs=["data"],
    outputs=["result"],
    keepdims=keepdims,
    select_last_index=True,
)

# result: [[0, 0]]
result = argmin_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_default_axis_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [1, 3, 4]
result = argmin_use_numpy_select_last_index(data, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_default_axis_random_select_last_index",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMin", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# The content of result is : [[1], [0]]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node, inputs=[data], outputs=[result], name="test_argmin_keepdims_example"
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node, inputs=[data], outputs=[result], name="test_argmin_keepdims_random"
)
```

</details>


<details>
<summary>keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMin",
    inputs=["data"],
    outputs=["result"],
    axis=axis,
    keepdims=keepdims,
    select_last_index=True,
)
# result: [[1], [0]]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_keepdims_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 1, 4]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_keepdims_random_select_last_index",
)
```

</details>


<details>
<summary>negative_axis_keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMin", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# The content of result is : [[1], [0]]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_negative_axis_keepdims_example",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_negative_axis_keepdims_random",
)
```

</details>


<details>
<summary>negative_axis_keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = -1
keepdims = 1
node = onnx.helper.make_node(
    "ArgMin",
    inputs=["data"],
    outputs=["result"],
    axis=axis,
    keepdims=keepdims,
    select_last_index=True,
)
# result: [[1], [0]]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_negative_axis_keepdims_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 3, 1]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_negative_axis_keepdims_random_select_last_index",
)
```

</details>


<details>
<summary>no_keepdims</summary>

```python
data = np.array([[2, 1], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
    "ArgMin", inputs=["data"], outputs=["result"], axis=axis, keepdims=keepdims
)
# result: [1, 0]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_no_keepdims_example",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmin_use_numpy(data, axis=axis, keepdims=keepdims)
expect(
    node, inputs=[data], outputs=[result], name="test_argmin_no_keepdims_random"
)
```

</details>


<details>
<summary>no_keepdims_select_last_index</summary>

```python
data = np.array([[2, 2], [3, 10]], dtype=np.float32)
axis = 1
keepdims = 0
node = onnx.helper.make_node(
    "ArgMin",
    inputs=["data"],
    outputs=["result"],
    axis=axis,
    keepdims=keepdims,
    select_last_index=True,
)
# result: [1, 0]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_no_keepdims_example_select_last_index",
)

data = np.random.uniform(-10, 10, [2, 3, 4]).astype(np.float32)
# result's shape: [2, 4]
result = argmin_use_numpy_select_last_index(data, axis=axis, keepdims=keepdims)
expect(
    node,
    inputs=[data],
    outputs=[result],
    name="test_argmin_no_keepdims_random_select_last_index",
)
```

</details>


### <a name="Asin"></a><a name="asin">**Asin**</a>

  Calculates the arcsine (inverse of sine) of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The arcsine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>asin</summary>

```python
node = onnx.helper.make_node(
    "Asin",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-0.5, 0, 0.5]).astype(np.float32)
y = np.arcsin(x)
expect(node, inputs=[x], outputs=[y], name="test_asin_example")

x = np.random.rand(3, 4, 5).astype(np.float32)
y = np.arcsin(x)
expect(node, inputs=[x], outputs=[y], name="test_asin")
```

</details>


### <a name="Asinh"></a><a name="asinh">**Asinh**</a>

  Calculates the hyperbolic arcsine of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The hyperbolic arcsine values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>asinh</summary>

```python
node = onnx.helper.make_node(
    "Asinh",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.arcsinh(x)  # expected output [-0.88137358,  0.,  0.88137358]
expect(node, inputs=[x], outputs=[y], name="test_asinh_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.arcsinh(x)
expect(node, inputs=[x], outputs=[y], name="test_asinh")
```

</details>


### <a name="Atan"></a><a name="atan">**Atan**</a>

  Calculates the arctangent (inverse of tangent) of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The arctangent of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>atan</summary>

```python
node = onnx.helper.make_node(
    "Atan",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.arctan(x)
expect(node, inputs=[x], outputs=[y], name="test_atan_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.arctan(x)
expect(node, inputs=[x], outputs=[y], name="test_atan")
```

</details>


### <a name="Atanh"></a><a name="atanh">**Atanh**</a>

  Calculates the hyperbolic arctangent of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The hyperbolic arctangent values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>atanh</summary>

```python
node = onnx.helper.make_node(
    "Atanh",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-0.5, 0, 0.5]).astype(np.float32)
y = np.arctanh(x)  # expected output [-0.54930615,  0.,  0.54930615]
expect(node, inputs=[x], outputs=[y], name="test_atanh_example")

x = np.random.uniform(0.0, 1.0, (3, 4, 5)).astype(np.float32)
y = np.arctanh(x)
expect(node, inputs=[x], outputs=[y], name="test_atanh")
```

</details>


### <a name="AveragePool"></a><a name="averagepool">**AveragePool**</a>

  AveragePool consumes an input tensor X and applies average pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Average pooling consists of computing the average of all values in a
   subset of the input tensor, selected according to the kernel size, and downsampling the
   data into the output tensor Y for further processing. The output spatial shape is calculated differently
   depending on whether explicit padding (the `pads` attribute) or auto padding (the `auto_pad` attribute) is used.
   With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):
   ```
   output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilations[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
   ```
   or
   ```
   output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilations[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
   ```
   if ceil_mode is enabled. `pad_shape[i]` is the sum of pads along axis `i`. Sliding windows that would start in the right padded region are ignored.
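
The explicit-padding formula above can be evaluated for a single spatial axis with a few lines of Python. This is a minimal sketch: the function name and parameter names are hypothetical, and it applies the formula only, without the additional rule that windows starting in the right padded region are ignored.

```python
import math


def pool_output_size(input_size, kernel, stride=1, pad_total=0, dilation=1, ceil_mode=False):
    """Output length along one axis; pad_total is the sum of pads on that axis."""
    span = input_size + pad_total - dilation * (kernel - 1) - 1
    rounding = math.ceil if ceil_mode else math.floor
    return rounding(span / stride + 1)


floor_size = pool_output_size(32, kernel=3, stride=2)
ceil_size = pool_output_size(32, kernel=3, stride=2, ceil_mode=True)
padded_size = pool_output_size(5, kernel=2, stride=1, pad_total=2)
```

For an input of length 32 with a kernel of 3 and a stride of 2, the unrounded value is 15.5, so the two rounding modes give 15 and 16.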

   `auto_pad` is a DEPRECATED attribute. If you are using it currently, the output spatial shape will be as follows when ceil_mode is enabled:
   ```
   VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
   ```
   or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):
   ```
   VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1
   ```
   The pad shape is as follows if `SAME_UPPER` or `SAME_LOWER` is used:
   ```
   pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
   ```
   The output of each pooling window is the sum of its elements divided by the number of elements (pad positions are excluded from the count when attribute count_include_pad is zero).
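
To make the effect of `count_include_pad` concrete, here is a small NumPy sketch for a single pooling window. It assumes a 5 x 5 input holding the values 1 to 25, a 5 x 5 kernel, and pads of 2 on every side, and looks only at the top-left output element; the variable names are illustrative.

```python
import numpy as np

x = np.arange(1, 26, dtype=np.float32).reshape(5, 5)

# With pads of 2 on every side, the top-left 5 x 5 window covers
# rows 0..2 and columns 0..2 of x plus 16 padded positions.
window = x[:3, :3]
kernel_elems = 5 * 5

avg_exclude_pad = window.sum() / window.size   # count_include_pad=0 -> 63 / 9
avg_include_pad = window.sum() / kernel_elems  # count_include_pad=1 -> 63 / 25

print(avg_exclude_pad, avg_include_pad)
```

The two values, 7 and 2.52, agree with the first output element of the `averagepool_2d_precomputed_pads` and `averagepool_2d_precomputed_pads_count_include_pad` examples in this section.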


#### Version

This version of the operator has been available since version 19 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#AveragePool-1">1</a>, <a href="Changelog.md#AveragePool-7">7</a>, <a href="Changelog.md#AveragePool-10">10</a>, <a href="Changelog.md#AveragePool-11">11</a>

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string (default is NOTSET)</dt>
<dd>auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER or SAME_LOWER means the input is padded so that `output_shape[i] = ceil(input_shape[i] / strides[i])` for each axis `i`. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). If the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER.</dd>
<dt><tt>ceil_mode</tt> : int (default is 0)</dt>
<dd>Whether to use ceil or floor (default) to compute the output shape.</dd>
<dt><tt>count_include_pad</tt> : int (default is 0)</dt>
<dd>Whether to include pad pixels when calculating values for the edges. Default is 0, which does not include pad pixels in the count.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>Dilation value along each spatial axis of filter. If not present, the dilation defaults to 1 along each spatial axis.</dd>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and end along each spatial axis; it can take any value greater than or equal to 0. The value represents the number of pixels added to the beginning and end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>
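
The way `SAME_UPPER` and `SAME_LOWER` split the total padding, and the `pads` layout it maps to, can be sketched as follows. The helper name `same_auto_pads` is hypothetical, and clamping a negative total to zero is an assumption of this sketch rather than something stated in the attribute descriptions.

```python
def same_auto_pads(auto_pad, input_shape, kernel_shape, strides, dilations):
    """Hypothetical helper: pads as [x1_begin, x2_begin, ..., x1_end, x2_end, ...]."""
    begins, ends = [], []
    for size, k, s, d in zip(input_shape, kernel_shape, strides, dilations):
        out = -(-size // s)  # ceil(input_shape[i] / strides[i])
        # pad_shape[i] from the operator description; clamped at 0 (assumption).
        total = max((out - 1) * s + (k - 1) * d + 1 - size, 0)
        small, large = total // 2, total - total // 2
        if auto_pad == "SAME_UPPER":
            begins.append(small)  # extra padding goes at the end
            ends.append(large)
        elif auto_pad == "SAME_LOWER":
            begins.append(large)  # extra padding goes at the beginning
            ends.append(small)
        else:
            raise ValueError(f"unsupported auto_pad for this sketch: {auto_pad}")
    return begins + ends


print(same_auto_pads("SAME_UPPER", [32, 32], [2, 2], [1, 1], [1, 1]))  # [0, 0, 1, 1]
print(same_auto_pads("SAME_LOWER", [32, 32], [2, 2], [1, 1], [1, 1]))  # [1, 1, 0, 0]
```

These results correspond to the `(pad_top, pad_left, pad_bottom, pad_right)` tuples built in the `averagepool_2d_same_upper` and `averagepool_2d_same_lower` examples in this section.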

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor from average pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes. The floor value of the dimension is used unless ceil_mode is enabled.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>averagepool_1d_default</summary>

```python
"""input_shape: [1, 3, 32]
output_shape: [1, 3, 31]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2],
)
x = np.random.randn(1, 3, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2]
strides = [1]
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG")

expect(node, inputs=[x], outputs=[y], name="test_averagepool_1d_default")
```

</details>


<details>
<summary>averagepool_2d_ceil</summary>

```python
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    strides=[2, 2],
    ceil_mode=True,
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[6, 7.5], [12, 13.5]]]]).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_ceil")
```

</details>


<details>
<summary>averagepool_2d_default</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 31, 31]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (2, 2)
strides = (1, 1)
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG")

expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_default")
```

</details>


<details>
<summary>averagepool_2d_dilations</summary>

```python
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    strides=[1, 1],
    dilations=[2, 2],
    ceil_mode=True,
)

# input shape: [1, 1, 4, 4]
x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)

y = np.array([[[[6, 7], [10, 11]]]]).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_dilations")
```

</details>


<details>
<summary>averagepool_2d_pads</summary>

```python
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[2, 2, 2, 2],
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = 2
pad_top = 2
pad_right = 2
pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, pads = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides, ceil_mode=False
)
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pads[0], pads[2]), (pads[1], pads[3])),
    mode="constant",
    constant_values=np.nan,
)
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG", pads)

expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_pads")
```

</details>


<details>
<summary>averagepool_2d_pads_count_include_pad</summary>

```python
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[2, 2, 2, 2],
    count_include_pad=1,
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
dilations = (1, 1)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = 2
pad_top = 2
pad_right = 2
pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, pads = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides, dilations, ceil_mode=False
)
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pads[0], pads[2]), (pads[1], pads[3])),
    mode="constant",
    constant_values=0,
)
y = pool(
    padded,
    x_shape,
    kernel_shape,
    strides,
    out_shape,
    "AVG",
    pads,
    count_include_pad=1,
)

expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_averagepool_2d_pads_count_include_pad",
)
```

</details>


<details>
<summary>averagepool_2d_precomputed_pads</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array(
    [
        [
            [
                [7, 7.5, 8, 8.5, 9],
                [9.5, 10, 10.5, 11, 11.5],
                [12, 12.5, 13, 13.5, 14],
                [14.5, 15, 15.5, 16, 16.5],
                [17, 17.5, 18, 18.5, 19],
            ]
        ]
    ]
).astype(np.float32)

expect(
    node, inputs=[x], outputs=[y], name="test_averagepool_2d_precomputed_pads"
)
```

</details>


<details>
<summary>averagepool_2d_precomputed_pads_count_include_pad</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
    count_include_pad=1,
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array(
    [
        [
            [
                [2.5200, 3.6000, 4.8000, 4.0800, 3.2400],
                [4.5600, 6.4000, 8.4000, 7.0400, 5.5200],
                [7.2000, 10.0000, 13.0000, 10.8000, 8.4000],
                [6.9600, 9.6000, 12.4000, 10.2400, 7.9200],
                [6.1200, 8.4000, 10.8000, 8.8800, 6.8400],
            ]
        ]
    ]
).astype(np.float32)

expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_averagepool_2d_precomputed_pads_count_include_pad",
)
```

</details>


<details>
<summary>averagepool_2d_precomputed_same_upper</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 3, 3]
pad_shape: [2, 2] -> [1, 1, 1, 1] by axis
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    strides=[2, 2],
    auto_pad="SAME_UPPER",
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[4, 5.5, 7], [11.5, 13, 14.5], [19, 20.5, 22]]]]).astype(
    np.float32
)

expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_averagepool_2d_precomputed_same_upper",
)
```

</details>


<details>
<summary>averagepool_2d_precomputed_strides</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    strides=[2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[4, 6], [14, 16]]]]).astype(np.float32)

expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_averagepool_2d_precomputed_strides",
)
```

</details>


<details>
<summary>averagepool_2d_same_lower</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [1, 0, 1, 0] by axis
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_LOWER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
    "SAME_LOWER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
    "SAME_LOWER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_bottom = pad_shape[0] // 2
pad_top = pad_shape[0] - pad_bottom
pad_right = pad_shape[1] // 2
pad_left = pad_shape[1] - pad_right
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)
pads = (pad_top, pad_left, pad_bottom, pad_right)
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG", pads)

expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_same_lower")
```

</details>


<details>
<summary>averagepool_2d_same_upper</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [0, 1, 0, 1] by axis
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_UPPER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
    "SAME_UPPER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
    "SAME_UPPER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_top = pad_shape[0] // 2
pad_bottom = pad_shape[0] - pad_top
pad_left = pad_shape[1] // 2
pad_right = pad_shape[1] - pad_left
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)
pads = (pad_top, pad_left, pad_bottom, pad_right)
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG", pads)

expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_same_upper")
```

</details>


<details>
<summary>averagepool_2d_strides</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 10, 10]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    strides=[3, 3],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (5, 5)
strides = (3, 3)
out_shape, pads = get_output_shape_explicit_padding(
    None, x_shape[2:], kernel_shape, strides, ceil_mode=False
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG", pads)

expect(node, inputs=[x], outputs=[y], name="test_averagepool_2d_strides")
```

</details>


<details>
<summary>averagepool_3d_default</summary>

```python
"""input_shape: [1, 3, 32, 32, 32]
output_shape: [1, 3, 31, 31, 31]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2, 2],
)
x = np.random.randn(1, 3, 32, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "AVG")

expect(node, inputs=[x], outputs=[y], name="test_averagepool_3d_default")
```

</details>


<details>
<summary>averagepool_3d_dilations</summary>

```python
"""input_shape: [1, 1, 4, 4, 4]
output_shape: [1, 1, 2, 2, 2]
"""
node = onnx.helper.make_node(
    "AveragePool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2, 2],
    strides=[1, 1, 1],
    dilations=[2, 2, 2],
    ceil_mode=True,
)

# input shape: [1, 1, 4, 4, 4]
x = np.array(
    [
        [
            [
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
            ]
        ]
    ]
).astype(np.float32)

y = np.array([[[[[6, 7], [10, 11]], [[6, 7], [10, 11]]]]]).astype(np.float32)

expect(
    node, inputs=[x], outputs=[y], name="test_averagepool_3d_dilations_small"
)
```

</details>


<details>
<summary>averagepool_3d_dilations_large</summary>

```python
x_shape = (32, 32, 32)
dilations = (2, 2, 2)
kernel_shape = (5, 5, 5)
strides = (3, 3, 3)
count_include_pad = 0

for count_include_pad in (0, 1):
    for ceil_mode in (True, False):
        node = onnx.helper.make_node(
            "AveragePool",
            inputs=["x"],
            outputs=["y"],
            kernel_shape=kernel_shape,
            strides=strides,
            dilations=dilations,
            count_include_pad=count_include_pad,
            ceil_mode=ceil_mode,
        )

        x = np.random.randn(1, 1, *x_shape).astype(np.float32)
        out_shape, pads = get_output_shape_explicit_padding(
            None,
            x_shape,
            kernel_shape,
            strides,
            dilations=dilations,
            ceil_mode=ceil_mode,
        )
        padded = np.pad(
            x,
            (
                (0, 0),
                (0, 0),
                (pads[0], pads[3]),
                (pads[1], pads[4]),
                (pads[2], pads[5]),
            ),
            mode="constant",
            constant_values=0 if count_include_pad == 1 else np.nan,
        )
        y = pool(
            padded,
            (1, 1, *x_shape),
            kernel_shape,
            strides,
            out_shape,
            "AVG",
            pads=pads,
            dilations=dilations,
            count_include_pad=count_include_pad,
        )

        test_name = f"test_averagepool_3d_dilations_large_count_include_pad_is_{count_include_pad}_ceil_mode_is_{ceil_mode}"
        expect(node, inputs=[x], outputs=[y], name=test_name)
```

</details>


### <a name="BatchNormalization"></a><a name="batchnormalization">**BatchNormalization**</a>

  Carries out batch normalization as described in the paper
  https://arxiv.org/abs/1502.03167.
  There are five required inputs: 'X', 'scale', 'B', 'input_mean' and
  'input_var'.
  Note that 'input_mean' and 'input_var' are expected to be the estimated
  statistics in inference mode (training_mode=False, default),
  and the running statistics in training mode (training_mode=True).
  Depending on the mode in which it is run, there are multiple cases for the number of outputs, which we list below:

  * Output case #1: Y, running_mean, running_var (training_mode=True)
  * Output case #2: Y (training_mode=False)

  When training_mode=False, extra outputs are invalid.
  The outputs are updated as follows when training_mode=True:
  ```
  running_mean = input_mean * momentum + current_mean * (1 - momentum)
  running_var = input_var * momentum + current_var * (1 - momentum)

  Y = (X - current_mean) / sqrt(current_var + epsilon) * scale + B
  ```
  where:
  ```
  current_mean = ReduceMean(X, axis=all_except_channel_index)
  current_var =  ReduceVar(X, axis=all_except_channel_index)
  ```
  Notice that `ReduceVar` refers to the population variance, and it equals
  `sum(sqrd(x_i - x_avg)) / N`,
  where `N` is the population size (this formula does not use the sample size `N - 1`).

  The computation of ReduceMean and ReduceVar uses float to avoid overflow for float16 inputs.

  When training_mode=False:
  ```
  Y = (X - input_mean) / sqrt(input_var + epsilon) * scale + B
  ```

  For previous (deprecated) non-spatial cases, implementors are advised
  to flatten the input shape to (N x C * D1 * D2 * ... * Dn) before a BatchNormalization Op.
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
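
The inference-mode and training-mode formulas above can be sketched in NumPy as follows. The function names `batchnorm_inference` and `batchnorm_training` are illustrative and are not the reference helpers used by the examples in this section. The sketch assumes `X` has at least two dimensions with channels on axis 1, and it does not reproduce the float upcast described for float16 inputs.

```python
import numpy as np


def batchnorm_inference(x, scale, bias, input_mean, input_var, epsilon=1e-5):
    # Reshape per-channel vectors to (1, C, 1, ..., 1) so they broadcast over N and D1..Dn.
    shape = (1, -1) + (1,) * (x.ndim - 2)
    normalized = (x - input_mean.reshape(shape)) / np.sqrt(input_var.reshape(shape) + epsilon)
    return normalized * scale.reshape(shape) + bias.reshape(shape)


def batchnorm_training(x, scale, bias, input_mean, input_var, momentum=0.9, epsilon=1e-5):
    axes = tuple(i for i in range(x.ndim) if i != 1)  # all axes except the channel axis
    current_mean = x.mean(axis=axes)
    current_var = x.var(axis=axes)  # population variance: divides by N, not N - 1
    running_mean = input_mean * momentum + current_mean * (1 - momentum)
    running_var = input_var * momentum + current_var * (1 - momentum)
    # Y is normalized with the statistics of the current batch.
    y = batchnorm_inference(x, scale, bias, current_mean, current_var, epsilon)
    return y, running_mean, running_var
```

`np.var` is called with its default `ddof=0`, which matches the population-variance definition of `ReduceVar` given above.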

#### Version

This version of the operator has been available since version 15 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#BatchNormalization-1">1</a>, <a href="Changelog.md#BatchNormalization-6">6</a>, <a href="Changelog.md#BatchNormalization-7">7</a>, <a href="Changelog.md#BatchNormalization-9">9</a>, <a href="Changelog.md#BatchNormalization-14">14</a>

#### Attributes

<dl>
<dt><tt>epsilon</tt> : float (default is 1e-05)</dt>
<dd>The epsilon value to use to avoid division by zero.</dd>
<dt><tt>momentum</tt> : float (default is 0.9)</dt>
<dd>Factor used in computing the running mean and variance, e.g., running_mean = running_mean * momentum + mean * (1 - momentum).</dd>
<dt><tt>training_mode</tt> : int (default is 0)</dt>
<dd>If set to true, it indicates BatchNormalization is being used for training, and outputs 1 and 2 are to be computed.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size, C is the number of channels. Statistics are computed for every channel of C over N and D1 to Dn dimensions. For image data, input dimensions become (N x C x H x W). The op also accepts single dimension input of size N in which case C is assumed to be 1</dd>
<dt><tt>scale</tt> (differentiable) : T1</dt>
<dd>Scale tensor of shape (C).</dd>
<dt><tt>B</tt> (differentiable) : T1</dt>
<dd>Bias tensor of shape (C).</dd>
<dt><tt>input_mean</tt> (differentiable) : T2</dt>
<dd>running (training) or estimated (testing) mean tensor of shape (C).</dd>
<dt><tt>input_var</tt> (differentiable) : T2</dt>
<dd>running (training) or estimated (testing) variance tensor of shape (C).</dd>
</dl>

#### Outputs (1 - 3)

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>The output tensor of the same shape as X</dd>
<dt><tt>running_mean</tt> (optional, non-differentiable) : T2</dt>
<dd>The running mean after the BatchNormalization operator.</dd>
<dt><tt>running_var</tt> (optional, non-differentiable) : T2</dt>
<dd>The running variance after the BatchNormalization operator. This op uses the population size (N) for calculating variance, and not the sample size N-1.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain scale and bias types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain mean and variance types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>batchnormalization</summary>

```python
# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
y = _batchnorm_test_mode(x, s, bias, mean, var).astype(np.float32)

node = onnx.helper.make_node(
    "BatchNormalization",
    inputs=["x", "s", "bias", "mean", "var"],
    outputs=["y"],
)

# output size: (2, 3, 4, 5)
expect(
    node,
    inputs=[x, s, bias, mean, var],
    outputs=[y],
    name="test_batchnorm_example",
)

# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
epsilon = 1e-2
y = _batchnorm_test_mode(x, s, bias, mean, var, epsilon).astype(np.float32)

node = onnx.helper.make_node(
    "BatchNormalization",
    inputs=["x", "s", "bias", "mean", "var"],
    outputs=["y"],
    epsilon=epsilon,
)

# output size: (2, 3, 4, 5)
expect(
    node,
    inputs=[x, s, bias, mean, var],
    outputs=[y],
    name="test_batchnorm_epsilon",
)
```

</details>


<details>
<summary>train</summary>

```python
# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
# using np.bool(1) while generating test data fails with "'bool' object has no attribute 'dtype'"
# working around by using np.byte(1).astype(bool)
training_mode = 1
y, output_mean, output_var = _batchnorm_training_mode(x, s, bias, mean, var)

node = onnx.helper.make_node(
    "BatchNormalization",
    inputs=["x", "s", "bias", "mean", "var"],
    outputs=["y", "output_mean", "output_var"],
    training_mode=training_mode,
)

# output size: (2, 3, 4, 5)
expect(
    node,
    inputs=[x, s, bias, mean, var],
    outputs=[y, output_mean, output_var],
    name="test_batchnorm_example_training_mode",
)

# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
mean = np.random.randn(3).astype(np.float32)
var = np.random.rand(3).astype(np.float32)
training_mode = 1
momentum = 0.9
epsilon = 1e-2
y, output_mean, output_var = _batchnorm_training_mode(
    x, s, bias, mean, var, momentum, epsilon
)

node = onnx.helper.make_node(
    "BatchNormalization",
    inputs=["x", "s", "bias", "mean", "var"],
    outputs=["y", "output_mean", "output_var"],
    epsilon=epsilon,
    training_mode=training_mode,
)

# output size: (2, 3, 4, 5)
expect(
    node,
    inputs=[x, s, bias, mean, var],
    outputs=[y, output_mean, output_var],
    name="test_batchnorm_epsilon_training_mode",
)
```

</details>


### <a name="Bernoulli"></a><a name="bernoulli">**Bernoulli**</a>

  Draws binary random numbers (0 or 1) from a Bernoulli distribution. The input tensor should be a tensor
  containing probabilities p (a value in the range [0,1]) to be used for drawing the binary random number,
  where an output of 1 is produced with probability p and an output of 0 is produced with probability (1-p).

  This operator is non-deterministic and may not produce the same values in different
  implementations (even if a seed is specified).
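
One common way to draw such values is to compare uniform random numbers against the probabilities, as in the sketch below. The function `bernoulli_sketch` is a hypothetical illustration and not the reference implementation used by the examples in this section. It takes an integer seed for NumPy's generator, whereas the operator's `seed` attribute is a float, and its outputs are not expected to match any particular ONNX backend.

```python
import numpy as np


def bernoulli_sketch(p, dtype=None, seed=None):
    """Hypothetical helper: draw 0/1 values with probability p of producing 1."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p)
    out_dtype = p.dtype if dtype is None else dtype  # default: same type as the input
    # A uniform draw in [0, 1) falls below p with probability p.
    return (rng.random(p.shape) < p).astype(out_dtype)


probs = np.array([0.0, 1.0, 0.0, 1.0], dtype=np.float32)
print(bernoulli_sketch(probs, seed=0))  # [0. 1. 0. 1.]
```

Probabilities of exactly 0 and 1 give deterministic outputs, which is why the printed result does not depend on the seed.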

#### Version

This version of the operator has been available since version 15 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>The data type for the elements of the output tensor. If not specified, the data type of the input tensor is used.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed to the random generator. If not specified, one is generated automatically.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>All values in input have to be in the range [0, 1].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>The returned output tensor only has values 0 or 1, same shape as input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bool)</dt>
<dd>Constrain output types to all numeric tensors and bool tensors.</dd>
</dl>


#### Examples

<details>
<summary>bernoulli_with_dtype</summary>

```python
node = onnx.helper.make_node(
    "Bernoulli",
    inputs=["x"],
    outputs=["y"],
    dtype=onnx.TensorProto.DOUBLE,
)

x = np.random.uniform(0.0, 1.0, 10).astype(np.float32)
y = bernoulli_reference_implementation(x, float)
expect(node, inputs=[x], outputs=[y], name="test_bernoulli_double")
```

</details>


<details>
<summary>bernoulli_with_seed</summary>

```python
seed = float(0)
node = onnx.helper.make_node(
    "Bernoulli",
    inputs=["x"],
    outputs=["y"],
    seed=seed,
)

x = np.random.uniform(0.0, 1.0, 10).astype(np.float32)
y = bernoulli_reference_implementation(x, np.float32)
expect(node, inputs=[x], outputs=[y], name="test_bernoulli_seed")
```

</details>


<details>
<summary>bernoulli_without_dtype</summary>

```python
node = onnx.helper.make_node(
    "Bernoulli",
    inputs=["x"],
    outputs=["y"],
)

x = np.random.uniform(0.0, 1.0, 10).astype(float)
y = bernoulli_reference_implementation(x, float)
expect(node, inputs=[x], outputs=[y], name="test_bernoulli")
```

</details>


### <a name="BitShift"></a><a name="bitshift">**BitShift**</a>

  Bitwise shift operator performs element-wise operation. For each input element, if the
  attribute "direction" is "RIGHT", this operator moves its binary representation toward
  the right side so that the input value is effectively decreased. If the attribute "direction"
  is "LEFT", bits of binary representation move toward the left side, which results in an
  increase of its actual value. The input X is the tensor to be shifted and another input
  Y specifies the amounts of shifting. For example, if "direction" is "RIGHT", X is [1, 4],
  and Y is [1, 1], the corresponding output Z would be [0, 2]. If "direction" is "LEFT" with
  X=[1, 2] and Y=[1, 2], the corresponding output Z would be [2, 8].

  Because this operator supports Numpy-style broadcasting, X's and Y's shapes are
  not necessarily identical.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
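
The two worked cases from the description, plus one broadcasting case, can be checked with NumPy's shift functions. The wrapper `bitshift` below is a hypothetical convenience for this illustration, not an ONNX API.

```python
import numpy as np


def bitshift(x, y, direction):
    """Hypothetical wrapper mirroring the "direction" attribute."""
    if direction == "RIGHT":
        return np.right_shift(x, y)
    if direction == "LEFT":
        return np.left_shift(x, y)
    raise ValueError(f"unknown direction: {direction}")


# Cases from the description above.
z_right = bitshift(np.array([1, 4], dtype=np.uint8), np.array([1, 1], dtype=np.uint8), "RIGHT")
z_left = bitshift(np.array([1, 2], dtype=np.uint8), np.array([1, 2], dtype=np.uint8), "LEFT")
print(z_right, z_left)  # [0 2] [2 8]

# Broadcasting: a (2, 3) tensor shifted by a (3,) tensor of shift amounts.
x = np.array([[16, 4, 1], [8, 2, 32]], dtype=np.uint8)
y = np.array([1, 2, 3], dtype=np.uint8)
print(bitshift(x, y, "RIGHT"))  # [[8 1 0] [4 0 4]]
```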

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>direction</tt> : string (required)</dt>
<dd>Direction of moving bits. It can be either "RIGHT" (for right shift) or "LEFT" (for left shift).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>First operand, input to be shifted.</dd>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>Second operand, amounts of shift.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Z</tt> (non-differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64)</dt>
<dd>Constrain input and output types to integer tensors.</dd>
</dl>


#### Examples

<details>
<summary>left_uint16</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)

x = np.array([16, 4, 1]).astype(np.uint16)
y = np.array([1, 2, 3]).astype(np.uint16)
z = x << y  # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint16")
```

</details>


<details>
<summary>left_uint32</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)

x = np.array([16, 4, 1]).astype(np.uint32)
y = np.array([1, 2, 3]).astype(np.uint32)
z = x << y  # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint32")
```

</details>


<details>
<summary>left_uint64</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)

x = np.array([16, 4, 1]).astype(np.uint64)
y = np.array([1, 2, 3]).astype(np.uint64)
z = x << y  # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint64")
```

</details>


<details>
<summary>left_uint8</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="LEFT"
)

x = np.array([16, 4, 1]).astype(np.uint8)
y = np.array([1, 2, 3]).astype(np.uint8)
z = x << y  # expected output [32, 16, 8]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_left_uint8")
```

</details>


<details>
<summary>right_uint16</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)

x = np.array([16, 4, 1]).astype(np.uint16)
y = np.array([1, 2, 3]).astype(np.uint16)
z = x >> y  # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint16")
```

</details>


<details>
<summary>right_uint32</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)

x = np.array([16, 4, 1]).astype(np.uint32)
y = np.array([1, 2, 3]).astype(np.uint32)
z = x >> y  # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint32")
```

</details>


<details>
<summary>right_uint64</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)

x = np.array([16, 4, 1]).astype(np.uint64)
y = np.array([1, 2, 3]).astype(np.uint64)
z = x >> y  # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint64")
```

</details>


<details>
<summary>right_uint8</summary>

```python
node = onnx.helper.make_node(
    "BitShift", inputs=["x", "y"], outputs=["z"], direction="RIGHT"
)

x = np.array([16, 4, 1]).astype(np.uint8)
y = np.array([1, 2, 3]).astype(np.uint8)
z = x >> y  # expected output [8, 1, 0]
expect(node, inputs=[x, y], outputs=[z], name="test_bitshift_right_uint8")
```

</details>


### <a name="BitwiseAnd"></a><a name="bitwiseand">**BitwiseAnd**</a>

  Returns the tensor resulting from performing the bitwise `and` operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
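
A small NumPy sketch of the elementwise `and` with broadcasting, using `np.bitwise_and` as the reference semantics (as the examples for this operator do). The binary literals are chosen only to make the masking visible.

```python
import numpy as np

a = np.array([[0b1100, 0b1010], [0b1111, 0b0001]], dtype=np.uint8)  # shape (2, 2)
b = np.array([0b0110, 0b0011], dtype=np.uint8)                      # shape (2,), broadcast over rows

c = np.bitwise_and(a, b)
print(c.tolist())  # [[4, 2], [6, 1]]
```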

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the bitwise operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the bitwise operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64)</dt>
<dd>Constrain input to integer tensors.</dd>
</dl>


#### Examples

<details>
<summary>bitwiseand</summary>

```python
node = onnx.helper.make_node(
    "BitwiseAnd",
    inputs=["x", "y"],
    outputs=["bitwiseand"],
)

# 2d
x = create_random_int((3, 4), np.int32)
y = create_random_int((3, 4), np.int32)
z = np.bitwise_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_and_i32_2d")

# 3d
x = create_random_int((3, 4, 5), np.int16)
y = create_random_int((3, 4, 5), np.int16)
z = np.bitwise_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_and_i16_3d")
```

</details>


<details>
<summary>bitwiseand_broadcast</summary>

```python
node = onnx.helper.make_node(
    "BitwiseAnd",
    inputs=["x", "y"],
    outputs=["bitwiseand"],
)

# 3d vs 1d
x = create_random_int((3, 4, 5), np.uint64)
y = create_random_int((5,), np.uint64)
z = np.bitwise_and(x, y)
expect(
    node, inputs=[x, y], outputs=[z], name="test_bitwise_and_ui64_bcast_3v1d"
)

# 4d vs 3d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = create_random_int((4, 5, 6), np.uint8)
z = np.bitwise_and(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_and_ui8_bcast_4v3d")
```

</details>


### <a name="BitwiseNot"></a><a name="bitwisenot">**BitwiseNot**</a>

  Returns the bitwise not of the input tensor element-wise.
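
The result depends on whether the element type is signed. The sketch below uses `np.bitwise_not` as the reference semantics and assumes the usual two's complement representation for signed types.

```python
import numpy as np

x_unsigned = np.array([0, 1, 255], dtype=np.uint8)
x_signed = np.array([0, 1, -1], dtype=np.int8)

y_unsigned = np.bitwise_not(x_unsigned)  # every bit flipped within 8 bits
y_signed = np.bitwise_not(x_signed)      # equals -x - 1 in two's complement

print(y_unsigned.tolist())  # [255, 254, 0]
print(y_signed.tolist())    # [-1, -2, 0]
```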

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64)</dt>
<dd>Constrain input/output to integer tensors.</dd>
</dl>


#### Examples

<details>
<summary>bitwisenot</summary>

```python
node = onnx.helper.make_node(
    "BitwiseNot",
    inputs=["x"],
    outputs=["bitwise_not"],
)

# 2d
x = create_random_int((3, 4), np.int32)
y = np.bitwise_not(x)
expect(node, inputs=[x], outputs=[y], name="test_bitwise_not_2d")

# 3d
x = create_random_int((3, 4, 5), np.uint16)
y = np.bitwise_not(x)
expect(node, inputs=[x], outputs=[y], name="test_bitwise_not_3d")

# 4d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = np.bitwise_not(x)
expect(node, inputs=[x], outputs=[y], name="test_bitwise_not_4d")
```

</details>


### <a name="BitwiseOr"></a><a name="bitwiseor">**BitwiseOr**</a>

  Returns the tensor resulting from performing the bitwise `or` operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
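
Multidirectional broadcasting means that both inputs may be expanded. In the NumPy sketch below, a column of shape (2, 1) is combined with a row of shape (2,) to produce a (2, 2) result; `np.bitwise_or` is used as the reference semantics.

```python
import numpy as np

a = np.array([[0b0001], [0b0100]], dtype=np.uint8)  # shape (2, 1)
b = np.array([0b0010, 0b1000], dtype=np.uint8)      # shape (2,)

c = np.bitwise_or(a, b)  # both inputs are broadcast to shape (2, 2)
print(c.tolist())  # [[3, 9], [6, 12]]
```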

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the bitwise operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the bitwise operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64)</dt>
<dd>Constrain input to integer tensors.</dd>
</dl>


#### Examples

<details>
<summary>bitwiseor</summary>

```python
node = onnx.helper.make_node(
    "BitwiseOr",
    inputs=["x", "y"],
    outputs=["bitwiseor"],
)
# 2d
x = create_random_int((3, 4), np.int32)
y = create_random_int((3, 4), np.int32)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_i32_2d")

# 4d
x = create_random_int((3, 4, 5, 6), np.int8)
y = create_random_int((3, 4, 5, 6), np.int8)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_i16_4d")
```

</details>


<details>
<summary>bitwiseor_broadcast</summary>

```python
node = onnx.helper.make_node(
    "BitwiseOr",
    inputs=["x", "y"],
    outputs=["bitwiseor"],
)

# 3d vs 1d
x = create_random_int((3, 4, 5), np.uint64)
y = create_random_int((5,), np.uint64)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_ui64_bcast_3v1d")

# 4d vs 3d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = create_random_int((4, 5, 6), np.uint8)
z = np.bitwise_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_or_ui8_bcast_4v3d")
```

</details>


### <a name="BitwiseXor"></a><a name="bitwisexor">**BitwiseXor**</a>

  Returns the tensor resulting from performing the bitwise `xor` operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
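
A short NumPy sketch with signed inputs, using `np.bitwise_xor` as the reference semantics. It also shows the self-inverse property: applying `xor` with the same operand twice restores the original values.

```python
import numpy as np

a = np.array([5, -3, 12], dtype=np.int16)
b = np.array([3, 7, -1], dtype=np.int16)

c = np.bitwise_xor(a, b)
restored = np.bitwise_xor(c, b)  # xor with b a second time undoes the first

print(c.tolist())         # [6, -6, -13]
print(restored.tolist())  # [5, -3, 12]
```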

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the bitwise operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the bitwise operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64)</dt>
<dd>Constrain input to integer tensors.</dd>
</dl>


#### Examples

<details>
<summary>bitwisexor_broadcast</summary>

```python
node = onnx.helper.make_node(
    "BitwiseXor",
    inputs=["x", "y"],
    outputs=["bitwisexor"],
)

# 3d vs 1d
x = create_random_int((3, 4, 5), np.uint64)
y = create_random_int((5,), np.uint64)
z = np.bitwise_xor(x, y)
expect(
    node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_ui64_bcast_3v1d"
)

# 4d vs 3d
x = create_random_int((3, 4, 5, 6), np.uint8)
y = create_random_int((4, 5, 6), np.uint8)
z = np.bitwise_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_ui8_bcast_4v3d")
```

</details>


<details>
<summary>bitwisexor</summary>

```python
node = onnx.helper.make_node(
    "BitwiseXor",
    inputs=["x", "y"],
    outputs=["bitwisexor"],
)

# 2d
x = create_random_int((3, 4), np.int32)
y = create_random_int((3, 4), np.int32)
z = np.bitwise_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_i32_2d")

# 3d
x = create_random_int((3, 4, 5), np.int16)
y = create_random_int((3, 4, 5), np.int16)
z = np.bitwise_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_bitwise_xor_i16_3d")
```

</details>


### <a name="BlackmanWindow"></a><a name="blackmanwindow">**BlackmanWindow**</a>

  Generates a Blackman window as described in the paper https://ieeexplore.ieee.org/document/1455106.
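
The window can be sketched in NumPy as follows. The coefficients 0.42, 0.5 and 0.08 are the ones used in the examples for this operator; the function name `blackman_window` and its signature are hypothetical, and the sketch assumes `size > 1`.

```python
import numpy as np


def blackman_window(size: int, periodic: bool = True) -> np.ndarray:
    """Hypothetical reference: Blackman window of length `size` (assumes size > 1)."""
    n = np.arange(size, dtype=np.float32)
    # A periodic window divides by size, a symmetric one by size - 1.
    denom = size if periodic else size - 1
    window = (
        0.42
        - 0.5 * np.cos(2 * np.pi * n / denom)
        + 0.08 * np.cos(4 * np.pi * n / denom)
    )
    return window.astype(np.float32)


periodic_window = blackman_window(10)
symmetric_window = blackman_window(10, periodic=False)
```

Under these assumptions the symmetric variant agrees with `np.blackman(10)`, and the periodic variant agrees with the first 10 points of `np.blackman(11)`.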

#### Version

This version of the operator has been available since version 17 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>output_datatype</tt> : int (default is 1)</dt>
<dd>The data type of the output tensor. Strictly must be one of the values from DataType enum in TensorProto whose values correspond to T2. The default value is 1 = FLOAT. </dd>
<dt><tt>periodic</tt> : int (default is 1)</dt>
<dd>If 1, returns a window to be used as a periodic function. If 0, returns a symmetric window. When 'periodic' is specified, the operator computes a window of length size + 1 and returns the first size points. The default value is 1. </dd>
</dl>

#### Inputs

<dl>
<dt><tt>size</tt> (non-differentiable) : T1</dt>
<dd>A scalar value indicating the length of the window.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (non-differentiable) : T2</dt>
<dd>A Blackman window with length: size. The output has the shape: [size].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain the input size to int32 or int64 tensors.</dd>
<dt><tt>T2</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>blackmanwindow</summary>

```python
# Test periodic window
node = onnx.helper.make_node(
    "BlackmanWindow",
    inputs=["x"],
    outputs=["y"],
)
size = np.int32(10)
a0 = 0.42
a1 = -0.5
a2 = 0.08
y = a0
y += a1 * np.cos(2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
y += a2 * np.cos(4 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
expect(node, inputs=[size], outputs=[y], name="test_blackmanwindow")

# Test symmetric window
node = onnx.helper.make_node(
    "BlackmanWindow", inputs=["x"], outputs=["y"], periodic=0
)
size = np.int32(10)
a0 = 0.42
a1 = -0.5
a2 = 0.08
y = a0
y += a1 * np.cos(
    2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
y += a2 * np.cos(
    4 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
expect(node, inputs=[size], outputs=[y], name="test_blackmanwindow_symmetric")
```

</details>


### <a name="Cast"></a><a name="cast">**Cast**</a>

  The operator casts the elements of a given input tensor to a data type
  specified by the 'to' argument and returns an output tensor of the same size in
  the converted type. The 'to' argument must be one of the data types specified
  in the 'DataType' enum field in the TensorProto message.

  Casting from string tensors in plain (e.g., "3.14" and "1000") and scientific numeric representations
  (e.g., "1e-5" and "1E8") to float types is supported. For example, converting string "100.5" to an integer may
  yield result 100. Some string literals are reserved for special floating-point values:
  "+INF" (and "INF"), "-INF", and "NaN" are positive infinity, negative infinity, and not-a-number, respectively.
  Any string that matches "+INF" case-insensitively is mapped to positive infinity. The same
  case-insensitive rule applies to "INF" and "NaN". When casting from numeric tensors
  to string tensors, plain floating-point representation (such as "314.15926") is used.
  Converting a non-numerical-literal string such as "Hello World!" is undefined behavior. Converting
  a string that represents a floating-point value, such as "2.718", to INT is also undefined behavior.
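
The string-to-float rules can be illustrated with Python's built-in `float()` parser, which happens to accept the same plain, scientific, and special literals in a case-insensitive way. This is only a sketch of the expected values, not the behaviour of any particular ONNX runtime.

```python
import numpy as np

strings = ["3.14", "1000", "1e-5", "1E8", "INF", "+inf", "-INF", "nan"]

# float() parses each literal; the array then holds 32-bit floats.
values = np.array([float(s) for s in strings], dtype=np.float32)

print(np.isposinf(values[4:6]).tolist())  # [True, True]
print(bool(np.isneginf(values[6])))       # True
print(bool(np.isnan(values[7])))          # True
```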

  Conversion from a numerical type to any numerical type is always allowed.
  Users must be aware of the precision loss and value change caused by the range difference between two types.
  For example, a 64-bit float 3.1415926459 may be rounded to a 32-bit float 3.141592. Similarly, converting
  an integer 36 to Boolean may produce 1 because we truncate bits which can't be stored in the targeted type.

  In more detail, the conversion among numerical types should follow these rules
  if the destination type is not a float 8 type.

  * Casting from floating point to:
    * floating point: +/- infinity if OOR (out of range).
    * fixed point: undefined if OOR.
    * bool: +/- 0.0 to False; all else to True.
  * Casting from fixed point to:
    * floating point: +/- infinity if OOR. (+ infinity in the case of uint)
    * fixed point: when OOR, discard higher bits and reinterpret (with respect to two's complement representation for
      signed types). For example, 200 (int16) -> -56 (int8).
    * bool: zero to False; nonzero to True.
  * Casting from bool to:
    * floating point: `{1.0, 0.0}`.
    * fixed point: `{1, 0}`.
    * bool: no change.
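
Several of the rules above can be checked with NumPy's `astype`, which follows the same conventions for the cases shown here. The sketch deliberately avoids out-of-range float-to-integer casts, since those are undefined.

```python
import numpy as np

# fixed point -> fixed point, out of range: keep the low 8 bits and reinterpret as signed
wrapped = np.array([200], dtype=np.int16).astype(np.int8)
print(wrapped.tolist())  # [-56]

# floating point -> bool: +/- 0.0 is False, everything else (including NaN) is True
float_to_bool = np.array([0.0, -0.0, 0.5, np.nan], dtype=np.float32).astype(bool)
print(float_to_bool.tolist())  # [False, False, True, True]

# fixed point -> bool: zero is False, nonzero is True
int_to_bool = np.array([0, 36, -1], dtype=np.int32).astype(bool)
print(int_to_bool.tolist())  # [False, True, True]

# bool -> floating point and bool -> fixed point
bool_to_float = np.array([True, False]).astype(np.float32)
bool_to_int = np.array([True, False]).astype(np.int64)
print(bool_to_float.tolist(), bool_to_int.tolist())  # [1.0, 0.0] [1, 0]
```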

  Float 8 types were introduced to speed up the training of
  deep models. By default, the conversion of a float *x* obeys
  the following rules. `[x]` means the value rounded to
  the target mantissa width.

  | x | E4M3FN | E4M3FNUZ | E5M2 | E5M2FNUZ |
  |------|----|----|----|----|
  | 0 | 0 | 0 | 0 | 0 |
  |-0 | -0 | 0 | -0 | 0 |
  | NaN | NaN | NaN | NaN | NaN |
  | +/- Inf | +/- FLT_MAX | NaN | +/- FLT_MAX | NaN |
  | [x] > FLT_MAX | FLT_MAX | FLT_MAX | FLT_MAX | FLT_MAX |
  | [x] < -FLT_MAX | -FLT_MAX | -FLT_MAX | -FLT_MAX | -FLT_MAX |
  | else | RNE | RNE | RNE | RNE |

  The behavior changes if the parameter 'saturate' is set to False.
  The rules then become:

  | x | E4M3FN | E4M3FNUZ | E5M2 | E5M2FNUZ |
  |------|----|----|----|----|
  | 0 | 0 | 0 | 0 | 0 |
  |-0 | -0 | 0 | -0 | 0 |
  | NaN | NaN | NaN | NaN | NaN |
  | +/- Inf | NaN | NaN | +/- Inf | NaN |
  | [x] > FLT_MAX | NaN | NaN | Inf | NaN |
  | [x] < -FLT_MAX | NaN | NaN | -Inf | NaN |
  | else | RNE | RNE | RNE | RNE |
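
The special-case rows of the two tables can be sketched in plain Python. This is a hypothetical helper, not an ONNX API: it only models NaN, infinite, and out-of-range inputs, takes its argument as already rounded (`[x]`), and returns in-range values unchanged instead of applying RNE. The largest finite values (448, 240, 57344, 57344) come from the float 8 format definitions rather than from this section, and the sign of an infinite input is assumed to carry over for E5M2 as it does for E4M3FN.

```python
import math

# Assumed per-type data: (largest finite value, overflow result when saturate=0)
F8_FORMATS = {
    "E4M3FN": (448.0, math.nan),
    "E4M3FNUZ": (240.0, math.nan),
    "E5M2": (57344.0, math.inf),
    "E5M2FNUZ": (57344.0, math.nan),
}


def out_of_range_result(x: float, to: str, saturate: bool = True) -> float:
    """Hypothetical sketch of the special cases; `x` stands for the rounded value [x]."""
    flt_max, overflow = F8_FORMATS[to]
    if math.isnan(x):
        return math.nan
    if math.isinf(x):
        if saturate:
            # The FNUZ types map infinities to NaN even when saturating.
            return math.nan if to.endswith("UZ") else math.copysign(flt_max, x)
        return math.copysign(overflow, x)
    if abs(x) > flt_max:
        return math.copysign(flt_max if saturate else overflow, x)
    return x  # in range: a real conversion would round to nearest even here


print(out_of_range_result(1e6, "E4M3FN"))                  # 448.0
print(out_of_range_result(-1e6, "E5M2", saturate=False))   # -inf
```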

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Cast-1">1</a>, <a href="Changelog.md#Cast-6">6</a>, <a href="Changelog.md#Cast-9">9</a>, <a href="Changelog.md#Cast-13">13</a>, <a href="Changelog.md#Cast-19">19</a>

#### Attributes

<dl>
<dt><tt>saturate</tt> : int (default is 1)</dt>
<dd>The parameter defines how the conversion behaves if an input value is out of range of the destination type. It only applies for float 8 conversion (float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. All cases are fully described in two tables inserted in the operator description.</dd>
<dt><tt>to</tt> : int (required)</dt>
<dd>The data type to which the elements of the input tensor are cast. Strictly must be one of the types from DataType enum in TensorProto</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T1</dt>
<dd>Input tensor to be cast.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T2</dt>
<dd>Output tensor with the same shape as input with type specified by the 'to' argument</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool), tensor(string), tensor(bfloat16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input types. Casting from complex is not supported.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool), tensor(string), tensor(bfloat16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain output types. Casting to complex is not supported.</dd>
</dl>


#### Examples

<details>
<summary>cast</summary>

```python
shape = (3, 4)
test_cases = [
    ("FLOAT", "FLOAT16"),
    ("FLOAT", "DOUBLE"),
    ("FLOAT16", "FLOAT"),
    ("FLOAT16", "DOUBLE"),
    ("DOUBLE", "FLOAT"),
    ("DOUBLE", "FLOAT16"),
    ("FLOAT", "STRING"),
    ("STRING", "FLOAT"),
    ("FLOAT", "BFLOAT16"),
    ("BFLOAT16", "FLOAT"),
    ("FLOAT", "FLOAT8E4M3FN"),
    ("FLOAT16", "FLOAT8E4M3FN"),
    ("FLOAT", "FLOAT8E4M3FNUZ"),
    ("FLOAT16", "FLOAT8E4M3FNUZ"),
    ("FLOAT8E4M3FN", "FLOAT"),
    ("FLOAT8E4M3FN", "FLOAT16"),
    ("FLOAT8E4M3FNUZ", "FLOAT"),
    ("FLOAT8E4M3FNUZ", "FLOAT16"),
    ("FLOAT", "FLOAT8E5M2"),
    ("FLOAT16", "FLOAT8E5M2"),
    ("FLOAT", "FLOAT8E5M2FNUZ"),
    ("FLOAT16", "FLOAT8E5M2FNUZ"),
    ("FLOAT8E5M2", "FLOAT"),
    ("FLOAT8E5M2", "FLOAT16"),
    ("FLOAT8E5M2FNUZ", "FLOAT"),
    ("FLOAT8E5M2FNUZ", "FLOAT16"),
    ("FLOAT", "UINT4"),
    ("FLOAT16", "UINT4"),
    ("FLOAT", "INT4"),
    ("FLOAT16", "INT4"),
    ("UINT4", "FLOAT"),
    ("UINT4", "FLOAT16"),
    ("UINT4", "UINT8"),
    ("INT4", "FLOAT"),
    ("INT4", "FLOAT16"),
    ("INT4", "INT8"),
]

vect_float32_to_float8e4m3 = np.vectorize(float32_to_float8e4m3)
vect_float32_to_float8e5m2 = np.vectorize(float32_to_float8e5m2)
vect_float32_to_uint4 = np.vectorize(
    lambda x: subbyte.float32_to_4bit_unpacked(x, signed=False)
)
vect_float32_to_int4 = np.vectorize(
    lambda x: subbyte.float32_to_4bit_unpacked(x, signed=True)
)

f8_types = ("FLOAT8E4M3FN", "FLOAT8E4M3FNUZ", "FLOAT8E5M2", "FLOAT8E5M2FNUZ")

for from_type, to_type in test_cases:
    input_type_proto = None
    output_type_proto = None
    if from_type == "BFLOAT16" or to_type == "BFLOAT16":
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.float32,
        )
        little_endian = sys.byteorder == "little"
        np_uint16_view = np_fp32.view(dtype=np.uint16)
        np_bfp16 = (
            np_uint16_view[1::2] if little_endian else np_uint16_view[0::2]
        )
        if to_type == "BFLOAT16":
            assert from_type == "FLOAT"
            input = np_fp32.reshape([3, 4])
            output = np_bfp16.reshape([3, 4])
            input_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.FLOAT), input.shape
            )
            output_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.BFLOAT16), output.shape
            )
        else:
            assert to_type == "FLOAT"
            input = np_bfp16.reshape([3, 4])
            # convert bfloat to FLOAT
            np_fp32_zeros = np.zeros((len(np_bfp16) * 2,), dtype=np.uint16)
            if little_endian:
                np_fp32_zeros[1::2] = np_bfp16
            else:
                np_fp32_zeros[0::2] = np_bfp16
            np_fp32_from_bfloat = np_fp32_zeros.view(dtype=np.float32)
            output = np_fp32_from_bfloat.reshape([3, 4])
            input_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.BFLOAT16), input.shape
            )
            output_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.FLOAT), output.shape
            )
    elif from_type in f8_types or to_type in f8_types:
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.7229038",
                "1000000",
                "1e-7",
                "NaN",
                "INF",
                "+INF",
                "-INF",
                "-0.0000001",
                "0.0000001",
                "-1000000",
            ],
            dtype=np.float32,
        )

        if from_type == "FLOAT":
            input_values = np_fp32
            input = make_tensor(
                "x", TensorProto.FLOAT, [3, 5], np_fp32.tolist()
            )
        elif from_type == "FLOAT16":
            input_values = np_fp32.astype(np.float16).astype(np.float32)
            input = make_tensor(
                "x", TensorProto.FLOAT16, [3, 5], input_values.tolist()
            )
        elif from_type == "FLOAT8E4M3FN":
            input_values = float8e4m3_to_float32(
                vect_float32_to_float8e4m3(np_fp32)
            )
            input = make_tensor(
                "x", TensorProto.FLOAT8E4M3FN, [3, 5], input_values.tolist()
            )
        elif from_type == "FLOAT8E4M3FNUZ":
            input_values = float8e4m3_to_float32(
                vect_float32_to_float8e4m3(np_fp32, uz=True), uz=True
            )
            input = make_tensor(
                "x", TensorProto.FLOAT8E4M3FNUZ, [3, 5], input_values.tolist()
            )
        elif from_type == "FLOAT8E5M2":
            input_values = float8e5m2_to_float32(
                vect_float32_to_float8e5m2(np_fp32)
            )
            input = make_tensor(
                "x", TensorProto.FLOAT8E5M2, [3, 5], input_values.tolist()
            )
        elif from_type == "FLOAT8E5M2FNUZ":
            input_values = float8e5m2_to_float32(
                vect_float32_to_float8e5m2(np_fp32, fn=True, uz=True),
                fn=True,
                uz=True,
            )
            input = make_tensor(
                "x", TensorProto.FLOAT8E5M2FNUZ, [3, 5], input_values.tolist()
            )
        else:
            raise ValueError(
                f"Conversion from {from_type} to {to_type} is not tested."
            )

        if to_type == "FLOAT8E4M3FN":
            expected = float8e4m3_to_float32(
                vect_float32_to_float8e4m3(input_values)
            )
        elif to_type == "FLOAT8E4M3FNUZ":
            expected = float8e4m3_to_float32(
                vect_float32_to_float8e4m3(input_values, uz=True), uz=True
            )
        elif to_type == "FLOAT8E5M2":
            expected = float8e5m2_to_float32(
                vect_float32_to_float8e5m2(input_values)
            )
        elif to_type == "FLOAT8E5M2FNUZ":
            expected = float8e5m2_to_float32(
                vect_float32_to_float8e5m2(input_values, fn=True, uz=True),
                fn=True,
                uz=True,
            )
        elif to_type == "FLOAT16":
            expected = input_values.astype(np.float16).astype(np.float32)
        elif to_type == "FLOAT":
            expected = input_values
        else:
            raise ValueError(
                f"Conversion from {from_type} to {to_type} is not tested."
            )
        expected_tensor = make_tensor(
            "x", getattr(TensorProto, to_type), [3, 5], expected.tolist()
        )
        output = expected_tensor
    elif from_type in ("UINT4", "INT4") or to_type in ("UINT4", "INT4"):
        np_fp32 = np.arange(-9, 16).astype(np.float32)
        input_shape = (5, 5)
        if from_type == "FLOAT":
            input_values = np_fp32
            input = make_tensor(
                "x", TensorProto.FLOAT, input_shape, input_values.tolist()
            )
        elif from_type == "FLOAT16":
            input_values = np_fp32.astype(np.float16)
            input = make_tensor(
                "x", TensorProto.FLOAT16, input_shape, input_values.tolist()
            )
        elif from_type == "UINT4":
            input_values = vect_float32_to_uint4(np_fp32)
            input = make_tensor(
                "x", TensorProto.UINT4, input_shape, input_values.tolist()
            )
        elif from_type == "INT4":
            input_values = vect_float32_to_int4(np_fp32)
            input = make_tensor(
                "x", TensorProto.INT4, input_shape, input_values.tolist()
            )
        else:
            raise ValueError(
                f"Conversion from {from_type} to {to_type} is not tested."
            )
        if to_type == "UINT4":
            expected = vect_float32_to_uint4(input_values).astype(custom.uint4)
        elif to_type == "INT4":
            expected = vect_float32_to_int4(input_values).astype(custom.int4)
        elif to_type == "FLOAT16":
            expected = input_values.astype(np.float16)
        elif to_type == "FLOAT":
            expected = input_values
        elif to_type == "UINT8":
            expected = input_values.astype(np.uint8)
        elif to_type == "INT8":
            expected = input_values.astype(np.int8)
        else:
            raise ValueError(
                f"Conversion from {from_type} to {to_type} is not tested."
            )
        expected_tensor = make_tensor(
            "y", getattr(TensorProto, to_type), input_shape, expected.tolist()
        )
        output = expected_tensor
        input_type_proto = onnx.helper.make_tensor_type_proto(
            getattr(TensorProto, from_type), input_shape
        )
        output_type_proto = onnx.helper.make_tensor_type_proto(
            getattr(TensorProto, to_type), input_shape
        )

    elif from_type != "STRING":
        input = np.random.random_sample(shape).astype(
            helper.tensor_dtype_to_np_dtype(getattr(TensorProto, from_type))
        )
        if to_type == "STRING":
            # Converting input to str, then give it object dtype for generating script
            ss = []
            for i in input.flatten():
                s = str(i).encode("utf-8")
                su = s.decode("utf-8")
                ss.append(su)

            output = np.array(ss).astype(object).reshape([3, 4])
        else:
            output = input.astype(
                helper.tensor_dtype_to_np_dtype(getattr(TensorProto, to_type))
            )
    else:
        input = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.dtype(object),
        ).reshape([3, 4])
        output = input.astype(
            helper.tensor_dtype_to_np_dtype(getattr(TensorProto, to_type))
        )
    node = onnx.helper.make_node(
        "Cast",
        inputs=["input"],
        outputs=["output"],
        to=getattr(TensorProto, to_type),
    )
    if input_type_proto and output_type_proto:
        expect(
            node,
            inputs=[input],
            outputs=[output],
            name="test_cast_" + from_type + "_to_" + to_type,
            input_type_protos=[input_type_proto],
            output_type_protos=[output_type_proto],
        )
    else:
        expect(
            node,
            inputs=[input],
            outputs=[output],
            name="test_cast_" + from_type + "_to_" + to_type,
        )
```

</details>


<details>
<summary>saturate_false</summary>

```python
test_cases = [
    ("FLOAT", "FLOAT8E4M3FN"),
    ("FLOAT16", "FLOAT8E4M3FN"),
    ("FLOAT", "FLOAT8E4M3FNUZ"),
    ("FLOAT16", "FLOAT8E4M3FNUZ"),
    ("FLOAT", "FLOAT8E5M2"),
    ("FLOAT16", "FLOAT8E5M2"),
    ("FLOAT", "FLOAT8E5M2FNUZ"),
    ("FLOAT16", "FLOAT8E5M2FNUZ"),
]
vect_float32_to_float8e4m3 = np.vectorize(float32_to_float8e4m3)
vect_float32_to_float8e5m2 = np.vectorize(float32_to_float8e5m2)

for from_type, to_type in test_cases:
    np_fp32 = np.array(
        [
            "0.47892547",
            "0.48033667",
            "0.49968487",
            "0.81910545",
            "0.47031248",
            "0.7229038",
            "1000000",
            "1e-7",
            "NaN",
            "INF",
            "+INF",
            "-INF",
            "-0.0000001",
            "0.0000001",
            "-1000000",
        ],
        dtype=np.float32,
    )

    if from_type == "FLOAT":
        input_values = np_fp32
        input = make_tensor("x", TensorProto.FLOAT, [3, 5], np_fp32.tolist())
    elif from_type == "FLOAT16":
        input_values = np_fp32.astype(np.float16).astype(np.float32)
        input = make_tensor(
            "x", TensorProto.FLOAT16, [3, 5], input_values.tolist()
        )
    else:
        raise ValueError(
            f"Conversion from {from_type} to {to_type} is not tested."
        )

    if to_type == "FLOAT8E4M3FN":
        expected = vect_float32_to_float8e4m3(input_values, saturate=False)
    elif to_type == "FLOAT8E4M3FNUZ":
        expected = vect_float32_to_float8e4m3(
            input_values, uz=True, saturate=False
        )
    elif to_type == "FLOAT8E5M2":
        expected = vect_float32_to_float8e5m2(input_values, saturate=False)
    elif to_type == "FLOAT8E5M2FNUZ":
        expected = vect_float32_to_float8e5m2(
            input_values, fn=True, uz=True, saturate=False
        )
    else:
        raise ValueError(
            f"Conversion from {from_type} to {to_type} is not tested."
        )

    ivals = bytes([int(i) for i in expected])
    tensor = TensorProto()
    tensor.data_type = getattr(TensorProto, to_type)
    tensor.name = "x"
    tensor.dims.extend([3, 5])
    field = tensor_dtype_to_field(tensor.data_type)
    getattr(tensor, field).extend(ivals)

    output = tensor

    node = onnx.helper.make_node(
        "Cast",
        inputs=["input"],
        outputs=["output"],
        to=getattr(TensorProto, to_type),
        saturate=0,
    )
    expect(
        node,
        inputs=[input],
        outputs=[output],
        name="test_cast_no_saturate_" + from_type + "_to_" + to_type,
    )
```

</details>


### <a name="CastLike"></a><a name="castlike">**CastLike**</a>

  The operator casts the elements of a given input tensor (the first input) to
  the same data type as the elements of the second input tensor.
  See documentation of the Cast operator for further details.
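
In NumPy terms, the operator behaves roughly like `astype` with the dtype taken from the second input. The helper name `cast_like` below is hypothetical; only the element type of `target_type` matters, while its shape and values are ignored.

```python
import numpy as np


def cast_like(x: np.ndarray, target_type: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: cast x to the element type of target_type."""
    return x.astype(target_type.dtype)


x = np.array([[1.5, -2.25], [0.0, 8.0]], dtype=np.float32)
target_type = np.zeros(1, dtype=np.float16)  # shape and values are irrelevant

output = cast_like(x, target_type)
print(output.dtype, output.shape)  # float16 (2, 2)
```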

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#CastLike-15">15</a>, <a href="Changelog.md#CastLike-19">19</a>

#### Attributes

<dl>
<dt><tt>saturate</tt> : int (default is 1)</dt>
<dd>Defines how the conversion behaves when an input value is out of range of the destination type. It applies only to float 8 conversions (float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz) and is true by default. Refer to the description of the Cast operator for further details.</dd>
</dl>
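
The following is a simplified sketch of what `saturate` changes for out-of-range values. It is not the reference implementation: the helper `handle_overflow` is hypothetical, rounding to 8 bits is not modelled, and the maximum finite values (448 for float8e4m3fn, 57344 for float8e5m2) and the non-saturating results are assumptions taken from the Cast operator description, which remains authoritative.

```python
import math

# Largest finite magnitudes (assumed from the Cast operator description).
FLOAT8_MAX = {"float8e4m3fn": 448.0, "float8e5m2": 57344.0}


def handle_overflow(x, to_type, saturate=True):
    """Return x unchanged when in range, else the out-of-range result."""
    limit = FLOAT8_MAX[to_type]
    if math.isnan(x) or abs(x) <= limit:
        return x  # in range; rounding to the 8-bit grid is not modelled here
    if saturate:
        # Saturation clamps to the largest finite value with the same sign.
        return math.copysign(limit, x)
    # Without saturation: float8e5m2 has infinities, float8e4m3fn does not.
    if to_type == "float8e5m2":
        return math.copysign(math.inf, x)
    return math.nan


print(handle_overflow(1000.0, "float8e4m3fn"))                  # 448.0
print(handle_overflow(1000.0, "float8e4m3fn", saturate=False))  # nan
```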

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T1</dt>
<dd>Input tensor to be cast.</dd>
<dt><tt>target_type</tt> (non-differentiable) : T2</dt>
<dd>The (first) input tensor will be cast to produce a tensor of the same type as this (second input) tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T2</dt>
<dd>Output tensor produced by casting the first input tensor to have the same type as the second input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool), tensor(string), tensor(bfloat16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input types. Casting from complex is not supported.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool), tensor(string), tensor(bfloat16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain output types. Casting to complex is not supported.</dd>
</dl>


#### Examples

<details>
<summary>castlike</summary>

```python
shape = (3, 4)
test_cases = [
    ("FLOAT", "FLOAT16"),
    ("FLOAT", "DOUBLE"),
    ("FLOAT16", "FLOAT"),
    ("FLOAT16", "DOUBLE"),
    ("DOUBLE", "FLOAT"),
    ("DOUBLE", "FLOAT16"),
    ("FLOAT", "STRING"),
    ("STRING", "FLOAT"),
    ("FLOAT", "BFLOAT16"),
    ("BFLOAT16", "FLOAT"),
    ("FLOAT", "FLOAT8E4M3FN"),
    ("FLOAT", "FLOAT8E4M3FNUZ"),
    ("FLOAT8E4M3FN", "FLOAT"),
    ("FLOAT8E4M3FNUZ", "FLOAT"),
    ("FLOAT", "FLOAT8E5M2"),
    ("FLOAT", "FLOAT8E5M2FNUZ"),
    ("FLOAT8E5M2", "FLOAT"),
    ("FLOAT8E5M2FNUZ", "FLOAT"),
]

vect_float32_to_float8e4m3 = np.vectorize(float32_to_float8e4m3)
vect_float32_to_float8e5m2 = np.vectorize(float32_to_float8e5m2)

for from_type, to_type in test_cases:
    input_type_proto = None
    output_type_proto = None
    if from_type == "BFLOAT16" or to_type == "BFLOAT16":
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.float32,
        )
        little_endian = sys.byteorder == "little"
        np_uint16_view = np_fp32.view(dtype=np.uint16)
        np_bfp16 = (
            np_uint16_view[1::2] if little_endian else np_uint16_view[0::2]
        )
        if to_type == "BFLOAT16":
            assert from_type == "FLOAT"
            input = np_fp32.reshape([3, 4])
            output = np_bfp16.reshape([3, 4])
            input_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.FLOAT), input.shape
            )
            output_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.BFLOAT16), output.shape
            )
        else:
            assert to_type == "FLOAT"
            input = np_bfp16.reshape([3, 4])
            # convert bfloat to FLOAT
            np_fp32_zeros = np.zeros((len(np_bfp16) * 2,), dtype=np.uint16)
            if little_endian:
                np_fp32_zeros[1::2] = np_bfp16
            else:
                np_fp32_zeros[0::2] = np_bfp16
            np_fp32_from_bfloat = np_fp32_zeros.view(dtype=np.float32)
            output = np_fp32_from_bfloat.reshape([3, 4])
            input_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.BFLOAT16), input.shape
            )
            output_type_proto = onnx.helper.make_tensor_type_proto(
                int(TensorProto.FLOAT), output.shape
            )
        like = output.flatten()[0:1]
    elif from_type in (
        "FLOAT8E4M3FN",
        "FLOAT8E4M3FNUZ",
        "FLOAT8E5M2",
        "FLOAT8E5M2FNUZ",
    ) or to_type in (
        "FLOAT8E4M3FN",
        "FLOAT8E4M3FNUZ",
        "FLOAT8E5M2",
        "FLOAT8E5M2FNUZ",
    ):
        np_fp32 = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.float32,
        )
        if to_type == "FLOAT8E4M3FN":
            expected = float8e4m3_to_float32(
                vect_float32_to_float8e4m3(np_fp32)
            )
            expected_tensor = make_tensor(
                "x", TensorProto.FLOAT8E4M3FN, [3, 4], expected.tolist()
            )
            like_tensor = make_tensor(
                "x", TensorProto.FLOAT8E4M3FN, [1], expected[:1]
            )
        elif to_type == "FLOAT8E4M3FNUZ":
            expected = float8e4m3_to_float32(
                vect_float32_to_float8e4m3(np_fp32, uz=True), uz=True
            )
            expected_tensor = make_tensor(
                "x", TensorProto.FLOAT8E4M3FNUZ, [3, 4], expected.tolist()
            )
            like_tensor = make_tensor(
                "x", TensorProto.FLOAT8E4M3FNUZ, [1], expected[:1]
            )
        elif to_type == "FLOAT8E5M2":
            expected = float8e5m2_to_float32(
                vect_float32_to_float8e5m2(np_fp32)
            )
            expected_tensor = make_tensor(
                "x", TensorProto.FLOAT8E5M2, [3, 4], expected.tolist()
            )
            like_tensor = make_tensor(
                "x", TensorProto.FLOAT8E5M2, [1], expected[:1]
            )
        elif to_type == "FLOAT8E5M2FNUZ":
            expected = float8e5m2_to_float32(
                vect_float32_to_float8e5m2(np_fp32, fn=True, uz=True),
                fn=True,
                uz=True,
            )
            expected_tensor = make_tensor(
                "x", TensorProto.FLOAT8E5M2FNUZ, [3, 4], expected.tolist()
            )
            like_tensor = make_tensor(
                "x", TensorProto.FLOAT8E5M2FNUZ, [1], expected[:1]
            )
        if from_type == "FLOAT":
            input = np_fp32.reshape((3, 4))
            output = expected_tensor
            like = like_tensor
        else:
            assert to_type == "FLOAT"
            input = expected_tensor
            output = expected.reshape((3, 4))
            like = output.flatten()[:1]
    elif from_type != "STRING":
        input = np.random.random_sample(shape).astype(
            helper.tensor_dtype_to_np_dtype(getattr(TensorProto, from_type))
        )
        if to_type == "STRING":
            # Converting input to str, then give it object dtype for generating script
            ss = []
            for i in input.flatten():
                s = str(i).encode("utf-8")
                su = s.decode("utf-8")
                ss.append(su)

            output = np.array(ss).astype(object).reshape([3, 4])
        else:
            output = input.astype(
                helper.tensor_dtype_to_np_dtype(getattr(TensorProto, to_type))
            )
        like = output.flatten()[0:1]
    else:
        input = np.array(
            [
                "0.47892547",
                "0.48033667",
                "0.49968487",
                "0.81910545",
                "0.47031248",
                "0.816468",
                "0.21087195",
                "0.7229038",
                "NaN",
                "INF",
                "+INF",
                "-INF",
            ],
            dtype=np.dtype(object),
        ).reshape([3, 4])
        output = input.astype(
            helper.tensor_dtype_to_np_dtype(getattr(TensorProto, to_type))
        )
        like = output.flatten()[0:1]
    node = onnx.helper.make_node(
        "CastLike",
        inputs=["input", "like"],
        outputs=["output"],
    )
    if input_type_proto and output_type_proto:
        like_type_proto = onnx.helper.make_tensor_type_proto(
            output_type_proto.tensor_type.elem_type, like.shape
        )

        expect(
            node,
            inputs=[input, like],
            outputs=[output],
            name="test_castlike_" + from_type + "_to_" + to_type,
            input_type_protos=[input_type_proto, like_type_proto],
            output_type_protos=[output_type_proto],
        )
    else:
        expect(
            node,
            inputs=[input, like],
            outputs=[output],
            name="test_castlike_" + from_type + "_to_" + to_type,
        )
```

</details>
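
The BFLOAT16 branch of the example above builds its expected values by keeping the upper 16 bits of each float32 bit pattern. A standard-library sketch of that truncation is shown below; the helper names are hypothetical, and the sketch truncates rather than rounds, as the example does.

```python
import struct


def float32_to_bfloat16_bits(value):
    """Keep the upper 16 bits of the float32 bit pattern (truncation)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits >> 16


def bfloat16_bits_to_float32(bits16):
    """Widen bfloat16 bits to float32 by zero-filling the lower 16 bits."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return value


print(hex(float32_to_bfloat16_bits(1.0)))  # 0x3f80
print(bfloat16_bits_to_float32(0x3F80))    # 1.0
```

Packing with an explicit big-endian format avoids the byte-order check that the NumPy view in the example needs.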


### <a name="Ceil"></a><a name="ceil">**Ceil**</a>

  Ceil takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the ceil function, y = ceil(x), is applied to
  the tensor elementwise. If x is integral, +0, -0, NaN, or infinite, x itself is returned.
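
The special cases listed above can be sketched in plain Python. The helper `ceil_elementwise` is hypothetical; it wraps `math.ceil`, which on its own returns an `int` and raises on NaN and infinity, so those inputs are passed through explicitly.

```python
import math


def ceil_elementwise(values):
    """Apply ceil to each float, returning NaN, infinities and zeros unchanged."""
    out = []
    for x in values:
        if math.isnan(x) or math.isinf(x):
            out.append(x)  # NaN and +/-inf are returned as-is
            continue
        y = float(math.ceil(x))
        if y == 0.0:
            # Keep the sign of the input, so -0.0 stays -0.0 and -0.5 maps to -0.0.
            y = math.copysign(0.0, x)
        out.append(y)
    return out


print(ceil_elementwise([-1.5, 1.2]))  # [-1.0, 2.0]
```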

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Ceil-1">1</a>, <a href="Changelog.md#Ceil-6">6</a>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>ceil</summary>

```python
node = onnx.helper.make_node(
    "Ceil",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1.5, 1.2]).astype(np.float32)
y = np.ceil(x)  # expected output [-1., 2.]
expect(node, inputs=[x], outputs=[y], name="test_ceil_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.ceil(x)
expect(node, inputs=[x], outputs=[y], name="test_ceil")
```

</details>


### <a name="Celu"></a><a name="celu">**Celu**</a>

  Continuously Differentiable Exponential Linear Units:
  Perform the linear unit element-wise on the input tensor X
  using the formula:

  ```
  max(0,x) + min(0,alpha*(exp(x/alpha)-1))
  ```
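
A scalar sketch of the formula in plain Python (the function name `celu` is chosen for illustration only):

```python
import math


def celu(x, alpha=1.0):
    """max(0, x) + min(0, alpha * (exp(x / alpha) - 1)) for a single value."""
    return max(0.0, x) + min(0.0, alpha * (math.exp(x / alpha) - 1.0))


print(celu(1.0))   # 1.0
print(celu(0.0))   # 0.0
print(celu(-1.0, alpha=2.0))
```

Non-negative inputs pass through unchanged, while negative inputs follow the scaled exponential branch and approach `-alpha` as x decreases.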

#### Version

This version of the operator has been available since version 12 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 1.0)</dt>
<dd>The alpha value in the Celu formula, which controls the shape of the unit. The default value is 1.0.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float)</dt>
<dd>Constrain input and output types to float32 tensors.</dd>
</dl>


#### Examples

<details>
<summary>celu</summary>

```python
alpha = 2.0
node = onnx.helper.make_node(
    "Celu",
    inputs=["X"],
    outputs=["Y"],
    alpha=alpha,
)

input_data = np.array(
    [
        [
            [[0.8439683], [0.5665144], [0.05836735]],
            [[0.02916367], [0.12964272], [0.5060197]],
            [[0.79538304], [0.9411346], [0.9546573]],
        ],
        [
            [[0.17730942], [0.46192095], [0.26480448]],
            [[0.6746842], [0.01665257], [0.62473077]],
            [[0.9240844], [0.9722341], [0.11965699]],
        ],
        [
            [[0.41356155], [0.9129373], [0.59330076]],
            [[0.81929934], [0.7862604], [0.11799799]],
            [[0.69248444], [0.54119414], [0.07513223]],
        ],
    ],
    dtype=np.float32,
)

# Calculate expected output data
positive_input = np.maximum(0, input_data)
negative_input = np.minimum(0, alpha * (np.exp(input_data / alpha) - 1))
expected_output = positive_input + negative_input

expect(node, inputs=[input_data], outputs=[expected_output], name="test_celu")
```

</details>


### <a name="CenterCropPad"></a><a name="centercroppad">**CenterCropPad**</a>

  Center crop or pad an input to given dimensions.

  The crop/pad dimensions can be specified for a subset of the `axes`. Non-specified dimensions will not be
  cropped or padded.

  If the input dimensions are bigger than the crop shape, a centered cropping window is extracted from the input.
  If the input dimensions are smaller than the crop shape, the input is padded on each side equally,
  so that the input is centered in the output.
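
The per-axis index arithmetic can be sketched as follows. The helpers are hypothetical, and the choice of `diff // 2` as the leading offset (so that any odd leftover goes to the trailing side) is inferred from the test cases in this section rather than stated by the description.

```python
def center_window(in_dim, out_dim):
    """Return ((src_start, src_end), (dst_start, dst_end)) for one axis."""
    if in_dim >= out_dim:
        # Crop: take a centered window of the input and fill the whole output.
        start = (in_dim - out_dim) // 2
        return (start, start + out_dim), (0, out_dim)
    # Pad: copy the whole input into a centered window of the output.
    start = (out_dim - in_dim) // 2
    return (0, in_dim), (start, start + in_dim)


def center_crop_pad_1d(values, out_dim, pad_value=0):
    """Apply the window to a flat list, padding with pad_value."""
    (s0, s1), (d0, d1) = center_window(len(values), out_dim)
    out = [pad_value] * out_dim
    out[d0:d1] = values[s0:s1]
    return out


print(center_window(20, 10))                  # ((5, 15), (0, 10))
print(center_crop_pad_1d([1, 2, 3, 4, 5], 3))  # [2, 3, 4]
print(center_crop_pad_1d([1, 2, 3], 6))        # [0, 1, 2, 3, 0, 0]
```

Applying `center_window` independently to every selected axis gives the slices used in the examples, such as `x[5:15, 1:8, :]` for a (20, 10, 3) input cropped to (10, 7, 3).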

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints</dt>
<dd>If provided, it specifies the subset of axes that 'shape' refers to. If not provided, all axes are assumed, i.e. [0, 1, ..., r-1], where r = rank(data). A negative value means counting dimensions from the back. The accepted range is [-r, r-1], where r = rank(data). Behavior is undefined if an axis is repeated.</dd>
</dl>
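
A small sketch of how `axes` and `shape` combine into the output shape, including negative axes. The helper `resolve_output_shape` is hypothetical and only mirrors the rules stated above.

```python
def resolve_output_shape(input_shape, shape, axes=None):
    """Return the output shape after applying `shape` to the selected axes."""
    rank = len(input_shape)
    if axes is None:
        axes = list(range(rank))  # default: every axis, in order
    if len(axes) != len(shape):
        raise ValueError("axes and shape must have the same length")
    out = list(input_shape)
    for axis, size in zip(axes, shape):
        if not -rank <= axis < rank:
            raise ValueError(f"axis {axis} is outside [-{rank}, {rank - 1}]")
        out[axis % rank] = size  # negative axes count from the back
    return tuple(out)


print(resolve_output_shape((20, 8, 3), [10, 9], axes=[-3, -2]))  # (10, 9, 3)
print(resolve_output_shape((3, 20, 8), [10, 9], axes=[1, 2]))    # (3, 10, 9)
```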

#### Inputs

<dl>
<dt><tt>input_data</tt> (differentiable) : T</dt>
<dd>Input to extract the centered crop from.</dd>
<dt><tt>shape</tt> (non-differentiable) : Tind</dt>
<dd>1-D tensor representing the cropping window dimensions.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output_data</tt> (differentiable) : T</dt>
<dd>Output data.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain indices to integer types</dd>
</dl>


#### Examples

<details>
<summary>center_crop_pad_crop</summary>

```python
node = onnx.helper.make_node(
    "CenterCropPad",
    inputs=["x", "shape"],
    outputs=["y"],
)

# First dim is even diff, second is uneven
x = np.random.randn(20, 10, 3).astype(np.float32)
shape = np.array([10, 7, 3], dtype=np.int64)
y = x[5:15, 1:8, :]

expect(node, inputs=[x, shape], outputs=[y], name="test_center_crop_pad_crop")
```

</details>


<details>
<summary>center_crop_pad_crop_and_pad</summary>

```python
node = onnx.helper.make_node(
    "CenterCropPad",
    inputs=["x", "shape"],
    outputs=["y"],
)

# Cropping on first dim, padding on second, third stays the same
x = np.random.randn(20, 8, 3).astype(np.float32)
shape = np.array([10, 10, 3], dtype=np.int64)
y = np.zeros([10, 10, 3], dtype=np.float32)
y[:, 1:9, :] = x[5:15, :, :]

expect(
    node,
    inputs=[x, shape],
    outputs=[y],
    name="test_center_crop_pad_crop_and_pad",
)
```

</details>


<details>
<summary>center_crop_pad_crop_axes_chw</summary>

```python
node = onnx.helper.make_node(
    "CenterCropPad",
    inputs=["x", "shape"],
    outputs=["y"],
    axes=[1, 2],
)

# Cropping on second dim, padding on third, first stays the same
x = np.random.randn(3, 20, 8).astype(np.float32)
shape = np.array([10, 9], dtype=np.int64)
y = np.zeros([3, 10, 9], dtype=np.float32)
y[:, :, :8] = x[:, 5:15, :]

expect(
    node,
    inputs=[x, shape],
    outputs=[y],
    name="test_center_crop_pad_crop_axes_chw",
)
```

</details>


<details>
<summary>center_crop_pad_crop_axes_hwc</summary>

```python
node = onnx.helper.make_node(
    "CenterCropPad",
    inputs=["x", "shape"],
    outputs=["y"],
    axes=[0, 1],
)

# Cropping on first dim, padding on second, third stays the same
x = np.random.randn(20, 8, 3).astype(np.float32)
shape = np.array([10, 9], dtype=np.int64)
y = np.zeros([10, 9, 3], dtype=np.float32)
y[:, :8, :] = x[5:15, :, :]

expect(
    node,
    inputs=[x, shape],
    outputs=[y],
    name="test_center_crop_pad_crop_axes_hwc",
)
```

</details>


<details>
<summary>center_crop_pad_crop_negative_axes_hwc</summary>

```python
node = onnx.helper.make_node(
    "CenterCropPad",
    inputs=["x", "shape"],
    outputs=["y"],
    axes=[-3, -2],
)

# Cropping on first dim, padding on second, third stays the same
x = np.random.randn(20, 8, 3).astype(np.float32)
shape = np.array([10, 9], dtype=np.int64)
y = np.zeros([10, 9, 3], dtype=np.float32)
y[:, :8, :] = x[5:15, :, :]

expect(
    node,
    inputs=[x, shape],
    outputs=[y],
    name="test_center_crop_pad_crop_negative_axes_hwc",
)
```

</details>


<details>
<summary>center_crop_pad_pad</summary>

```python
node = onnx.helper.make_node(
    "CenterCropPad",
    inputs=["x", "shape"],
    outputs=["y"],
)

# First dim is even diff, second is uneven
x = np.random.randn(10, 7, 3).astype(np.float32)
shape = np.array([20, 10, 3], dtype=np.int64)
y = np.zeros([20, 10, 3], dtype=np.float32)
y[5:15, 1:8, :] = x

expect(node, inputs=[x, shape], outputs=[y], name="test_center_crop_pad_pad")
```

</details>


### <a name="Clip"></a><a name="clip">**Clip**</a>

  Clip operator limits the given input within an interval. The interval is
  specified by the inputs 'min' and 'max'. They default to
  numeric_limits::lowest() and numeric_limits::max(), respectively.
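
A plain-Python sketch of these semantics, where `None` stands in for an omitted optional input (equivalent to using the lowest or largest representable value as the bound). The function name `clip` is illustrative only.

```python
def clip(values, min_val=None, max_val=None):
    """Limit each element to [min_val, max_val]; None means no bound on that side."""
    out = []
    for v in values:
        if min_val is not None and v < min_val:
            v = min_val
        if max_val is not None and v > max_val:
            v = max_val
        out.append(v)
    return out


print(clip([-2.0, 0.0, 2.0], -1.0, 1.0))  # [-1.0, 0.0, 1.0]
print(clip([-6, 0, 6], max_val=5))        # [-6, 0, 5]
print(clip([-1, 0, 1]))                   # [-1, 0, 1]
```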

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Clip-1">1</a>, <a href="Changelog.md#Clip-6">6</a>, <a href="Changelog.md#Clip-11">11</a>, <a href="Changelog.md#Clip-12">12</a>

#### Inputs (1 - 3)

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor whose elements are to be clipped.</dd>
<dt><tt>min</tt> (optional, non-differentiable) : T</dt>
<dd>Minimum value, under which an element is replaced by min. It must be a scalar (tensor of empty shape).</dd>
<dt><tt>max</tt> (optional, non-differentiable) : T</dt>
<dd>Maximum value, above which an element is replaced by max. It must be a scalar (tensor of empty shape).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Output tensor with clipped input elements</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>clip</summary>

```python
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", "min", "max"],
    outputs=["y"],
)

x = np.array([-2, 0, 2]).astype(np.float32)
min_val = np.float32(-1)
max_val = np.float32(1)
y = np.clip(x, min_val, max_val)  # expected output [-1., 0., 1.]
expect(
    node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip_example"
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, min_val, max_val)
expect(node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip")
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", "min", "max"],
    outputs=["y"],
)

min_val = np.float32(-5)
max_val = np.float32(5)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.array([-1, 0, 1]).astype(np.float32)
expect(
    node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip_inbounds"
)

x = np.array([-6, 0, 6]).astype(np.float32)
y = np.array([-5, 0, 5]).astype(np.float32)
expect(
    node, inputs=[x, min_val, max_val], outputs=[y], name="test_clip_outbounds"
)

x = np.array([-1, 0, 6]).astype(np.float32)
y = np.array([-1, 0, 5]).astype(np.float32)
expect(
    node,
    inputs=[x, min_val, max_val],
    outputs=[y],
    name="test_clip_splitbounds",
)
```

</details>


<details>
<summary>clip_default</summary>

```python
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", "min"],
    outputs=["y"],
)
min_val = np.float32(0)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, min_val, np.inf)
expect(node, inputs=[x, min_val], outputs=[y], name="test_clip_default_min")

no_min = ""  # optional input, not supplied
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", no_min, "max"],
    outputs=["y"],
)
max_val = np.float32(0)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, -np.inf, max_val)
expect(node, inputs=[x, max_val], outputs=[y], name="test_clip_default_max")

no_max = ""  # optional input, not supplied
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", no_min, no_max],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.array([-1, 0, 1]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_clip_default_inbounds")
```

</details>


<details>
<summary>clip_default_int8</summary>

```python
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", "min"],
    outputs=["y"],
)
min_val = np.int8(0)
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.clip(x, min_val, np.iinfo(np.int8).max)
expect(
    node, inputs=[x, min_val], outputs=[y], name="test_clip_default_int8_min"
)

no_min = ""  # optional input, not supplied
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", no_min, "max"],
    outputs=["y"],
)
max_val = np.int8(0)
x = np.random.randn(3, 4, 5).astype(np.int8)
y = np.clip(x, np.iinfo(np.int8).min, max_val)
expect(
    node, inputs=[x, max_val], outputs=[y], name="test_clip_default_int8_max"
)

no_max = ""  # optional input, not supplied
node = onnx.helper.make_node(
    "Clip",
    inputs=["x", no_min, no_max],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.int8)
y = np.array([-1, 0, 1]).astype(np.int8)
expect(node, inputs=[x], outputs=[y], name="test_clip_default_int8_inbounds")
```

</details>


### <a name="Col2Im"></a><a name="col2im">**Col2Im**</a>

  The operator rearranges column blocks back into a multidimensional image.

  Col2Im behaves similarly to PyTorch's fold https://pytorch.org/docs/stable/generated/torch.nn.Fold.html,
  but it only supports *batched* multi-dimensional image tensors.
  Another implementation in Python with N-dimension support can be found at https://github.com/f-dangel/unfoldNd/.

  NOTE:
    Although specifying image_shape looks redundant because it could be calculated from
    convolution formulas, it is required as input for more advanced scenarios as explained
    in PyTorch's implementation (https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/Col2Im.cpp#L10).

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>1-dimensional tensor with dilation value along each spatial axis of the image. If not present, the dilation defaults to 1 along each spatial axis of the image.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>1-dimensional tensor with padding values for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and represents the number of pixels added to the beginning or end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>1-dimensional tensor with stride value along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>
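
The attributes determine how many block positions exist along each spatial axis, and therefore the expected size L of the last input dimension. The sketch below uses the usual convolution output-size arithmetic, which is an assumption here (it is consistent with the examples in this section); the helper `block_counts` is hypothetical.

```python
def block_counts(image_shape, block_shape, dilations=None, pads=None, strides=None):
    """Number of block positions along each spatial axis."""
    n = len(image_shape)
    dilations = dilations or [1] * n
    strides = strides or [1] * n
    pads = pads or [0] * (2 * n)
    begins, ends = pads[:n], pads[n:]  # [x1_begin, ..., xn_begin, x1_end, ..., xn_end]
    counts = []
    for i in range(n):
        effective_block = dilations[i] * (block_shape[i] - 1) + 1
        padded = image_shape[i] + begins[i] + ends[i]
        counts.append((padded - effective_block) // strides[i] + 1)
    return counts


print(block_counts([5, 5], [1, 5], pads=[0, 1, 0, 1]))  # [5, 3]
print(block_counts([5, 5], [3, 3], strides=[2, 2]))     # [2, 2]
```

The product of the returned counts gives L, for example 5 * 3 = 15 for the `col2im_pads` example and 2 * 2 = 4 for `col2im_strides`.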

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input data tensor to be rearranged from column blocks back into an image. This is a 3-dimensional tensor containing [N, C * n-ary-product(block_shape), L], where N is the batch dimension, C is the image channel dimension and L is the number of blocks. The blocks are enumerated in increasing lexicographic order of their indices. For example, with an image size of 10*20 and a block size of 9*18, there would be 2*3 blocks, enumerated in the order block(0, 0), block(0, 1), block(0, 2), block(1, 0), block(1, 1), block(1, 2).</dd>
<dt><tt>image_shape</tt> (non-differentiable) : tensor(int64)</dt>
<dd>The shape of the spatial dimensions of the image after rearranging the column blocks. This is a 1-dimensional tensor with a size of at least 2, containing the value [H_img, W_img] for a 2-D image or [dim_i1, dim_i2, ..., dim_iN] for an N-D image.</dd>
<dt><tt>block_shape</tt> (non-differentiable) : tensor(int64)</dt>
<dd>The shape of the block to apply on the input. This is a 1-dimensional tensor with a size of at least 2, containing the value [H_block, W_block] for a 2-D image or [dim_b1, dim_b2, ..., dim_bN] for an N-D block. This is the block shape before dilation is applied to it.</dd>
</dl>
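
To make the input layout concrete, here is a minimal pure-Python sketch for the 2-D case only. It is not the ONNX reference implementation: the function name `col2im_2d` and the nested-list representation are assumptions made for illustration. Each input row corresponds to one (channel, block row, block column) triple, each column to one block position, and overlapping contributions are summed.

```python
def col2im_2d(cols, image_shape, block_shape,
              dilations=(1, 1), pads=(0, 0, 0, 0), strides=(1, 1)):
    """cols is [N][C * kh * kw][L]; the result is [N][C][H][W]."""
    img_h, img_w = image_shape
    kh, kw = block_shape
    dil_h, dil_w = dilations
    pad_h0, pad_w0, pad_h1, pad_w1 = pads
    str_h, str_w = strides
    # Block positions along the width; blocks are enumerated with width varying fastest.
    out_w = (img_w + pad_w0 + pad_w1 - dil_w * (kw - 1) - 1) // str_w + 1

    result = []
    for sample in cols:
        channels = len(sample) // (kh * kw)
        image = [[[0.0] * img_w for _ in range(img_h)] for _ in range(channels)]
        for row, values in enumerate(sample):
            c, rem = divmod(row, kh * kw)
            ki, kj = divmod(rem, kw)  # position inside the block
            for l, value in enumerate(values):
                bh, bw = divmod(l, out_w)  # position of the block itself
                h = bh * str_h + ki * dil_h - pad_h0
                w = bw * str_w + kj * dil_w - pad_w0
                if 0 <= h < img_h and 0 <= w < img_w:  # drop values that fall in padding
                    image[c][h][w] += value
        result.append(image)
    return result


# Same data as the col2im_strides example in this section.
pattern = [0, 1, 1, 1, 0, 0, 0, 1, 0]
cols = [[[float(p)] * 4 for p in pattern]]
image = col2im_2d(cols, (5, 5), (3, 3), strides=(2, 2))
print(image[0][0][2])  # [0.0, 2.0, 1.0, 2.0, 1.0]
```

Summing into the image is what makes overlapping blocks add up, which is why the middle row of this result contains the value 2.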

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Output tensor produced by rearranging blocks into an image.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>col2im</summary>

```python
input = np.array(
    [
        [
            [1.0, 6.0, 11.0, 16.0, 21.0],  # (1, 5, 5)
            [2.0, 7.0, 12.0, 17.0, 22.0],
            [3.0, 8.0, 13.0, 18.0, 23.0],
            [4.0, 9.0, 14.0, 19.0, 24.0],
            [5.0, 0.0, 15.0, 20.0, 25.0],
        ]
    ]
).astype(np.float32)

image_shape = np.array([5, 5]).astype(np.int64)
block_shape = np.array([1, 5]).astype(np.int64)
node = onnx.helper.make_node(
    "Col2Im", ["input", "image_shape", "block_shape"], ["output"]
)

output = np.array(
    [
        [
            [
                [1.0, 2.0, 3.0, 4.0, 5.0],  # (1, 1, 5, 5)
                [6.0, 7.0, 8.0, 9.0, 0.0],
                [11.0, 12.0, 13.0, 14.0, 15.0],
                [16.0, 17.0, 18.0, 19.0, 20.0],
                [21.0, 22.0, 23.0, 24.0, 25.0],
            ]
        ]
    ]
).astype(np.float32)

expect(
    node,
    inputs=[input, image_shape, block_shape],
    outputs=[output],
    name="test_col2im",
)
```

</details>


<details>
<summary>col2im_5d</summary>

```python
input = np.array(
    [
        [
            [1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56],  # (1, 10, 12)
            [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57],
            [3, 8, 13, 18, 23, 28, 33, 38, 43, 48, 53, 58],
            [4, 9, 14, 19, 24, 29, 34, 39, 44, 49, 54, 59],
            [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60],
            [61, 66, 71, 76, 81, 86, 91, 96, 101, 106, 111, 116],
            [62, 67, 72, 77, 82, 87, 92, 97, 102, 107, 112, 117],
            [63, 68, 73, 78, 83, 88, 93, 98, 103, 108, 113, 118],
            [64, 69, 74, 79, 84, 89, 94, 99, 104, 109, 114, 119],
            [65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120],
        ]
    ]
).astype(np.float32)
image_shape = np.array([3, 4, 5]).astype(np.int64)
block_shape = np.array([1, 1, 5]).astype(np.int64)

output = np.array(
    [
        [
            [
                [
                    [1, 2, 3, 4, 5],  # (1, 2, 3, 4, 5)
                    [6, 7, 8, 9, 10],
                    [11, 12, 13, 14, 15],
                    [16, 17, 18, 19, 20],
                ],
                [
                    [21, 22, 23, 24, 25],
                    [26, 27, 28, 29, 30],
                    [31, 32, 33, 34, 35],
                    [36, 37, 38, 39, 40],
                ],
                [
                    [41, 42, 43, 44, 45],
                    [46, 47, 48, 49, 50],
                    [51, 52, 53, 54, 55],
                    [56, 57, 58, 59, 60],
                ],
            ],
            [
                [
                    [61, 62, 63, 64, 65],
                    [66, 67, 68, 69, 70],
                    [71, 72, 73, 74, 75],
                    [76, 77, 78, 79, 80],
                ],
                [
                    [81, 82, 83, 84, 85],
                    [86, 87, 88, 89, 90],
                    [91, 92, 93, 94, 95],
                    [96, 97, 98, 99, 100],
                ],
                [
                    [101, 102, 103, 104, 105],
                    [106, 107, 108, 109, 110],
                    [111, 112, 113, 114, 115],
                    [116, 117, 118, 119, 120],
                ],
            ],
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "Col2Im", ["input", "image_shape", "block_shape"], ["output"]
)
expect(
    node,
    inputs=[input, image_shape, block_shape],
    outputs=[output],
    name="test_col2im_5d",
)
```

</details>


<details>
<summary>col2im_dilations</summary>

```python
input = np.array(
    [
        [
            [1.0, 5.0, 9.0, 13.0, 17],  # (1, 4, 5)
            [2.0, 6.0, 10.0, 14.0, 18],
            [3.0, 7.0, 11.0, 15.0, 19],
            [4.0, 8.0, 12.0, 16.0, 20],
        ]
    ]
).astype(np.float32)
image_shape = np.array([6, 6]).astype(np.int64)
block_shape = np.array([2, 2]).astype(np.int64)

output = np.array(
    [
        [
            [
                [1.0, 0.0, 0.0, 0.0, 0.0, 2.0],  # (1, 1, 6, 6)
                [8.0, 0.0, 0.0, 0.0, 0.0, 10.0],
                [16.0, 0.0, 0.0, 0.0, 0.0, 18.0],
                [24.0, 0.0, 0.0, 0.0, 0.0, 26.0],
                [32.0, 0.0, 0.0, 0.0, 0.0, 34.0],
                [19.0, 0.0, 0.0, 0.0, 0.0, 20.0],
            ]
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "Col2Im",
    ["input", "image_shape", "block_shape"],
    ["output"],
    dilations=[1, 5],
)
expect(
    node,
    inputs=[input, image_shape, block_shape],
    outputs=[output],
    name="test_col2im_dilations",
)
```

</details>


<details>
<summary>col2im_pads</summary>

```python
input = np.array(
    [
        [
            [
                1.0,
                6.0,
                11.0,
                16.0,
                21.0,
                26,
                31,
                36,
                41,
                46,
                51,
                56,
                61,
                66,
                71,
            ],  # (1, 5, 15)
            [
                2.0,
                7.0,
                12.0,
                17.0,
                22.0,
                27,
                32,
                37,
                42,
                47,
                52,
                57,
                62,
                67,
                72,
            ],
            [
                3.0,
                8.0,
                13.0,
                18.0,
                23.0,
                28,
                33,
                38,
                43,
                48,
                53,
                58,
                63,
                68,
                73,
            ],
            [
                4.0,
                9.0,
                14.0,
                19.0,
                24.0,
                29,
                34,
                39,
                44,
                49,
                54,
                59,
                64,
                69,
                74,
            ],
            [
                5.0,
                10.0,
                15.0,
                20.0,
                25.0,
                30,
                35,
                40,
                45,
                50,
                55,
                60,
                65,
                70,
                75,
            ],
        ]
    ]
).astype(np.float32)
image_shape = np.array([5, 5]).astype(np.int64)
block_shape = np.array([1, 5]).astype(np.int64)

output = np.array(
    [
        [
            [
                [8.0, 21.0, 24.0, 27.0, 24.0],  # (1, 1, 5, 5)
                [38.0, 66.0, 69.0, 72.0, 54.0],
                [68.0, 111.0, 114.0, 117.0, 84.0],
                [98.0, 156.0, 159.0, 162.0, 114.0],
                [128.0, 201.0, 204.0, 207.0, 144.0],
            ]
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "Col2Im",
    ["input", "image_shape", "block_shape"],
    ["output"],
    pads=[0, 1, 0, 1],
)
expect(
    node,
    inputs=[input, image_shape, block_shape],
    outputs=[output],
    name="test_col2im_pads",
)
```

</details>


<details>
<summary>col2im_strides</summary>

```python
input = np.array(
    [
        [
            [0.0, 0.0, 0.0, 0.0],  # (1, 9, 4)
            [1.0, 1.0, 1.0, 1.0],
            [1.0, 1.0, 1.0, 1.0],
            [1.0, 1.0, 1.0, 1.0],
            [0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0],
            [1.0, 1.0, 1.0, 1.0],
            [0.0, 0.0, 0.0, 0.0],
        ]
    ]
).astype(np.float32)
image_shape = np.array([5, 5]).astype(np.int64)
block_shape = np.array([3, 3]).astype(np.int64)

output = np.array(
    [
        [
            [
                [0.0, 1.0, 1.0, 1.0, 1.0],  # (1, 1, 5, 5)
                [1.0, 0.0, 1.0, 0.0, 0.0],
                [0.0, 2.0, 1.0, 2.0, 1.0],
                [1.0, 0.0, 1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 1.0, 0.0],
            ]
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "Col2Im",
    ["input", "image_shape", "block_shape"],
    ["output"],
    strides=[2, 2],
)
expect(
    node,
    inputs=[input, image_shape, block_shape],
    outputs=[output],
    name="test_col2im_strides",
)
```

</details>


### <a name="Compress"></a><a name="compress">**Compress**</a>

  Selects slices from an input tensor along a given axis where condition evaluates to True for each axis index.
      In case axis is not provided, input is flattened before elements are selected.
      Compress behaves like numpy.compress: https://docs.scipy.org/doc/numpy/reference/generated/numpy.compress.html


#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Compress-9">9</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(Optional) Axis along which to take slices. If not specified, input is flattened before elements are selected. A negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1.</dd>
<dt><tt>condition</tt> (non-differentiable) : T1</dt>
<dd>Rank 1 tensor of booleans indicating which slices or data elements are selected. Its length can be less than the input length along the axis, or less than the flattened input size if axis is not specified. In such cases, data slices or elements beyond the condition length are discarded.</dd>
</dl>
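
As a small illustration of the length rule above, the following sketch uses `numpy.compress`, which the operator description names as the reference behavior. The array values are made up for this example.

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32)

# The condition has 2 entries but axis 0 has 3 rows:
# the row without a matching condition entry is discarded.
short_condition = np.array([False, True])
rows = np.compress(short_condition, data, axis=0)
print(rows)  # [[3. 4.]]

# Without an axis the input is flattened first, so the result has rank 1.
flat = np.compress(np.array([False, True, True]), data)
print(flat)  # [2. 3.]
```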

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor of rank r if axis is specified. Otherwise output is a Tensor of rank 1.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain to boolean tensors.</dd>
</dl>


#### Examples

<details>
<summary>compress_0</summary>

```python
node = onnx.helper.make_node(
    "Compress",
    inputs=["input", "condition"],
    outputs=["output"],
    axis=0,
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1, 1])
output = np.compress(condition, input, axis=0)
# print(output)
# [[ 3.  4.]
# [ 5.  6.]]

expect(
    node,
    inputs=[input, condition.astype(bool)],
    outputs=[output],
    name="test_compress_0",
)
```

</details>


<details>
<summary>compress_1</summary>

```python
node = onnx.helper.make_node(
    "Compress",
    inputs=["input", "condition"],
    outputs=["output"],
    axis=1,
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1])
output = np.compress(condition, input, axis=1)
# print(output)
# [[ 2.]
# [ 4.]
# [ 6.]]

expect(
    node,
    inputs=[input, condition.astype(bool)],
    outputs=[output],
    name="test_compress_1",
)
```

</details>


<details>
<summary>compress_default_axis</summary>

```python
node = onnx.helper.make_node(
    "Compress",
    inputs=["input", "condition"],
    outputs=["output"],
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1, 0, 0, 1])
output = np.compress(condition, input)
# print(output)
# [ 2., 5.]

expect(
    node,
    inputs=[input, condition.astype(bool)],
    outputs=[output],
    name="test_compress_default_axis",
)
```

</details>


<details>
<summary>compress_negative_axis</summary>

```python
node = onnx.helper.make_node(
    "Compress",
    inputs=["input", "condition"],
    outputs=["output"],
    axis=-1,
)
input = np.array([[1, 2], [3, 4], [5, 6]]).astype(np.float32)
condition = np.array([0, 1])
output = np.compress(condition, input, axis=-1)
# print(output)
# [[ 2.]
# [ 4.]
# [ 6.]]
expect(
    node,
    inputs=[input, condition.astype(bool)],
    outputs=[output],
    name="test_compress_negative_axis",
)
```

</details>


### <a name="Concat"></a><a name="concat">**Concat**</a>

  Concatenate a list of tensors into a single tensor. All input tensors must have the same shape, except for the dimension size of the axis to concatenate on.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Concat-1">1</a>, <a href="Changelog.md#Concat-4">4</a>, <a href="Changelog.md#Concat-11">11</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (required)</dt>
<dd>Which axis to concat on. A negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(inputs).</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>inputs</tt> (variadic, differentiable) : T</dt>
<dd>List of tensors for concatenation</dd>
</dl>

#### Outputs

<dl>
<dt><tt>concat_result</tt> (differentiable) : T</dt>
<dd>Concatenated tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain output types to any tensor type.</dd>
</dl>


#### Examples

<details>
<summary>concat</summary>

```python
test_cases: Dict[str, Sequence[Any]] = {
    "1d": ([1, 2], [3, 4]),
    "2d": ([[1, 2], [3, 4]], [[5, 6], [7, 8]]),
    "3d": (
        [[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
        [[[9, 10], [11, 12]], [[13, 14], [15, 16]]],
    ),
}

for test_case, values_ in test_cases.items():
    values = [np.asarray(v, dtype=np.float32) for v in values_]
    for i in range(len(values[0].shape)):
        in_args = ["value" + str(k) for k in range(len(values))]
        node = onnx.helper.make_node(
            "Concat", inputs=list(in_args), outputs=["output"], axis=i
        )
        output = np.concatenate(values, i)
        expect(
            node,
            inputs=list(values),
            outputs=[output],
            name="test_concat_" + test_case + "_axis_" + str(i),
        )

    for i in range(-len(values[0].shape), 0):
        in_args = ["value" + str(k) for k in range(len(values))]
        node = onnx.helper.make_node(
            "Concat", inputs=list(in_args), outputs=["output"], axis=i
        )
        output = np.concatenate(values, i)
        expect(
            node,
            inputs=list(values),
            outputs=[output],
            name="test_concat_" + test_case + "_axis_negative_" + str(abs(i)),
        )
```

</details>


### <a name="ConcatFromSequence"></a><a name="concatfromsequence">**ConcatFromSequence**</a>

  Concatenate a sequence of tensors into a single tensor.
  All input tensors must have the same shape, except for the dimension size of the axis to concatenate on.
  By default 'new_axis' is 0, and the behavior is similar to numpy.concatenate.
  When 'new_axis' is 1, the behavior is similar to numpy.stack.
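
The two modes can be pictured with their NumPy counterparts. This is only a sketch: a plain Python list stands in for the ONNX tensor sequence, and the shapes are chosen arbitrarily.

```python
import numpy as np

# A Python list stands in for the ONNX tensor sequence (an assumption of this sketch).
sequence = [np.zeros((2, 3), dtype=np.float32), np.ones((2, 3), dtype=np.float32)]

# new_axis=0: join along an existing axis, like numpy.concatenate.
joined = np.concatenate(sequence, axis=0)
print(joined.shape)  # (4, 3)

# new_axis=1: insert a new axis and join along it, like numpy.stack.
# With rank r = 2, axis may then range over [-3, 2].
stacked = np.stack(sequence, axis=2)
print(stacked.shape)  # (2, 3, 2)
```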

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int (required)</dt>
<dd>Which axis to concat on. Accepted range is `[-r, r - 1]`, where `r` is the rank of input tensors. When `new_axis` is 1, accepted range is `[-r - 1, r]`.</dd>
<dt><tt>new_axis</tt> : int (default is 0)</dt>
<dd>Insert and concatenate on a new axis or not, default 0 means do not insert new axis.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input_sequence</tt> : S</dt>
<dd>Sequence of tensors for concatenation</dd>
</dl>

#### Outputs

<dl>
<dt><tt>concat_result</tt> : T</dt>
<dd>Concatenated tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain input types to any tensor type.</dd>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain output types to any tensor type.</dd>
</dl>


### <a name="Constant"></a><a name="constant">**Constant**</a>

  This operator produces a constant tensor. Exactly one of the provided attributes, either value, sparse_value,
  or value_* must be specified.

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Constant-1">1</a>, <a href="Changelog.md#Constant-9">9</a>, <a href="Changelog.md#Constant-11">11</a>, <a href="Changelog.md#Constant-12">12</a>, <a href="Changelog.md#Constant-13">13</a>, <a href="Changelog.md#Constant-19">19</a>

#### Attributes

<dl>
<dt><tt>sparse_value</tt> : sparse_tensor</dt>
<dd>The value for the elements of the output tensor in sparse format.</dd>
<dt><tt>value</tt> : tensor</dt>
<dd>The value for the elements of the output tensor.</dd>
<dt><tt>value_float</tt> : float</dt>
<dd>The value for the sole element for the scalar, float32, output tensor.</dd>
<dt><tt>value_floats</tt> : list of floats</dt>
<dd>The values for the elements for the 1D, float32, output tensor.</dd>
<dt><tt>value_int</tt> : int</dt>
<dd>The value for the sole element for the scalar, int64, output tensor.</dd>
<dt><tt>value_ints</tt> : list of ints</dt>
<dd>The values for the elements for the 1D, int64, output tensor.</dd>
<dt><tt>value_string</tt> : string</dt>
<dd>The value for the sole element for the scalar, UTF-8 string, output tensor.</dd>
<dt><tt>value_strings</tt> : list of strings</dt>
<dd>The values for the elements for the 1D, UTF-8 string, output tensor.</dd>
</dl>
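
To make the scalar and 1D cases above concrete, here is a minimal NumPy sketch. `constant_output` is a hypothetical helper written for this illustration (it is not part of the `onnx` package), and it only covers the float and int attributes.

```python
import numpy as np

# Hypothetical mapping used only in this sketch: attribute name -> output dtype.
_DTYPES = {
    "value_float": np.float32,
    "value_floats": np.float32,
    "value_int": np.int64,
    "value_ints": np.int64,
}


def constant_output(**attrs):
    """Return the array a Constant node would produce for one value_* attribute."""
    # Exactly one of the attributes must be specified.
    if len(attrs) != 1:
        raise ValueError("exactly one value attribute must be specified")
    ((name, value),) = attrs.items()
    if name not in _DTYPES:
        raise NotImplementedError(name)
    return np.array(value, dtype=_DTYPES[name])


scalar = constant_output(value_float=1.5)
print(scalar.shape, scalar.dtype)  # () float32

vector = constant_output(value_ints=[1, 2, 3])
print(vector.shape, vector.dtype)  # (3,) int64
```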

#### Inputs


#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor containing the same value as the provided tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>constant</summary>

```python
values = np.random.randn(5, 5).astype(np.float32)
node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["values"],
    value=onnx.helper.make_tensor(
        name="const_tensor",
        data_type=onnx.TensorProto.FLOAT,
        dims=values.shape,
        vals=values.flatten().astype(float),
    ),
)

expect(node, inputs=[], outputs=[values], name="test_constant")
```

</details>


### <a name="ConstantOfShape"></a><a name="constantofshape">**ConstantOfShape**</a>

  Generate a tensor with given value and shape.

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ConstantOfShape-9">9</a>, <a href="Changelog.md#ConstantOfShape-20">20</a>

#### Attributes

<dl>
<dt><tt>value</tt> : tensor</dt>
<dd>(Optional) The value of the output elements. Should be a one-element tensor. If not specified, it defaults to a tensor of value 0 and datatype float32.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>1D tensor. The shape of the expected output tensor. If an empty tensor is given, the output is a scalar. All values must be >= 0.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor of shape specified by 'input'. If attribute 'value' is specified, the value and datatype of the output tensor are taken from 'value'. If attribute 'value' is not specified, the value in the output defaults to 0, and the datatype defaults to float32.</dd>
</dl>
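
A rough NumPy analogue of the behavior described above, as a sketch: `constant_of_shape` is a hypothetical helper (not an ONNX API), and `value=None` stands in for the `value` attribute being absent.

```python
import numpy as np


def constant_of_shape(shape, value=None):
    """Fill a tensor of the given shape with a one-element value (hypothetical helper)."""
    if value is None:
        # Attribute not specified: value 0 with datatype float32.
        value = np.zeros(1, dtype=np.float32)
    value = np.asarray(value)
    if value.size != 1:
        raise ValueError("'value' should be a one-element tensor")
    dims = tuple(int(d) for d in shape)
    return np.full(dims, value.reshape(-1)[0], dtype=value.dtype)


ones = constant_of_shape([4, 3, 2], np.array([1], dtype=np.float32))
print(ones.shape, ones.dtype)  # (4, 3, 2) float32

# An empty shape tensor yields a scalar (rank 0) output.
scalar = constant_of_shape(np.array([], dtype=np.int64))
print(scalar.shape, scalar.dtype)  # () float32
```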

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrain input types.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint4), tensor(int4), tensor(bool), tensor(bfloat16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz)</dt>
<dd>Constrain output types to be numerics or boolean.</dd>
</dl>


#### Examples

<details>
<summary>float_ones</summary>

```python
x = np.array([4, 3, 2]).astype(np.int64)
tensor_value = onnx.helper.make_tensor(
    "value", onnx.TensorProto.FLOAT, [1], [1]
)
node = onnx.helper.make_node(
    "ConstantOfShape",
    inputs=["x"],
    outputs=["y"],
    value=tensor_value,
)

y = np.ones(x, dtype=np.float32)
expect(node, inputs=[x], outputs=[y], name="test_constantofshape_float_ones")
```

</details>


<details>
<summary>int32_shape_zero</summary>

```python
x = np.array(
    [
        0,
    ]
).astype(np.int64)
tensor_value = onnx.helper.make_tensor(
    "value", onnx.TensorProto.INT32, [1], [0]
)
node = onnx.helper.make_node(
    "ConstantOfShape",
    inputs=["x"],
    outputs=["y"],
    value=tensor_value,
)
y = np.zeros(x, dtype=np.int32)
expect(
    node, inputs=[x], outputs=[y], name="test_constantofshape_int_shape_zero"
)
```

</details>


<details>
<summary>int32_zeros</summary>

```python
x = np.array([10, 6]).astype(np.int64)
tensor_value = onnx.helper.make_tensor(
    "value", onnx.TensorProto.INT32, [1], [0]
)
node = onnx.helper.make_node(
    "ConstantOfShape",
    inputs=["x"],
    outputs=["y"],
    value=tensor_value,
)
y = np.zeros(x, dtype=np.int32)
expect(node, inputs=[x], outputs=[y], name="test_constantofshape_int_zeros")
```

</details>


### <a name="Conv"></a><a name="conv">**Conv**</a>

  The convolution operator consumes an input tensor and a filter, and
  computes the output.

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Conv-1">1</a>

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string (default is NOTSET)</dt>
<dd>auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER or SAME_LOWER means the input is padded so that `output_shape[i] = ceil(input_shape[i] / strides[i])` for each axis `i`. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). If the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.</dd>
<dt><tt>group</tt> : int (default is 1)</dt>
<dd>number of groups input channels and output channels are divided into.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>The shape of the convolution kernel. If not present, should be inferred from input W.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each spatial axis; it can take any value greater than or equal to 0. The values represent the number of pixels added to the beginning and end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>
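
The `auto_pad` and `pads` descriptions above can be sketched in plain Python. The helper names are hypothetical, and the total-padding expression is an assumption: it is the usual way to satisfy `output_shape[i] = ceil(input_shape[i] / strides[i])`, not a formula quoted from the operator definition.

```python
import math


def same_pads(input_size, kernel_size, stride=1, dilation=1, auto_pad="SAME_UPPER"):
    """Return (pad_begin, pad_end) for one spatial axis (hypothetical helper)."""
    output_size = math.ceil(input_size / stride)
    effective_kernel = (kernel_size - 1) * dilation + 1
    # Assumed derivation: smallest total padding that yields output_size.
    total = max((output_size - 1) * stride + effective_kernel - input_size, 0)
    small, large = total // 2, total - total // 2
    # An odd total puts the extra pixel at the end for SAME_UPPER
    # and at the beginning for SAME_LOWER.
    return (small, large) if auto_pad == "SAME_UPPER" else (large, small)


def pads_attribute(per_axis):
    """Flatten [(begin, end), ...] into [x1_begin, x2_begin, ..., x1_end, x2_end, ...]."""
    return [b for b, _ in per_axis] + [e for _, e in per_axis]


print(same_pads(6, 3, stride=2, auto_pad="SAME_UPPER"))  # (0, 1)
print(same_pads(6, 3, stride=2, auto_pad="SAME_LOWER"))  # (1, 0)
print(pads_attribute([(0, 1), (2, 3)]))  # [0, 2, 1, 3]
```

For a 5x5 input with a 3x3 kernel and `strides=[2, 2]`, the sketch gives one pixel of padding on every side under either mode.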

#### Inputs (2 - 3)

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 ... x Dn). Optionally, if dimension denotation is in effect, the operation expects input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
<dt><tt>W</tt> (differentiable) : T</dt>
<dd>The weight tensor that will be used in the convolutions; has size (M x C/group x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the kernel shape will be (M x C/group x k1 x k2 x ... x kn), where (k1 x k2 x ... kn) is the dimension of the kernel. Optionally, if dimension denotation is in effect, the operation expects the weight tensor to arrive with the dimension denotation of [FILTER_OUT_CHANNEL, FILTER_IN_CHANNEL, FILTER_SPATIAL, FILTER_SPATIAL ...]. Assuming zero based indices for the shape array, X.shape[1] == (W.shape[1] * group) == C and W.shape[0] mod G == 0. Or in other words FILTER_IN_CHANNEL multiplied by the number of groups should be equal to DATA_CHANNEL and the number of feature maps M should be a multiple of the number of groups G.</dd>
<dt><tt>B</tt> (optional, differentiable) : T</dt>
<dd>Optional 1D bias to be added to the convolution, has size of M.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.</dd>
</dl>
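
The dependence of the output dimensions on kernel size, stride, and pads can be sketched as follows. The formula is the conventional convolution arithmetic and is an assumption here, since the text above does not spell it out; `conv_output_size` is a hypothetical helper.

```python
def conv_output_size(input_size, kernel_size, stride=1, dilation=1,
                     pad_begin=0, pad_end=0):
    """Output length along one spatial axis (assumed conventional formula)."""
    effective_kernel = (kernel_size - 1) * dilation + 1
    return (input_size + pad_begin + pad_end - effective_kernel) // stride + 1


# A 7x5 input with a 3x3 kernel, strides=[2, 2] and pads=[1, 1, 1, 1].
height = conv_output_size(7, 3, stride=2, pad_begin=1, pad_end=1)
width = conv_output_size(5, 3, stride=2, pad_begin=1, pad_end=1)
print(height, width)  # 4 3

# The same input without padding.
print(conv_output_size(7, 3, stride=2), conv_output_size(5, 3, stride=2))  # 3 2
```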

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>conv</summary>

```python
x = np.array(
    [
        [
            [
                [0.0, 1.0, 2.0, 3.0, 4.0],  # (1, 1, 5, 5) input tensor
                [5.0, 6.0, 7.0, 8.0, 9.0],
                [10.0, 11.0, 12.0, 13.0, 14.0],
                [15.0, 16.0, 17.0, 18.0, 19.0],
                [20.0, 21.0, 22.0, 23.0, 24.0],
            ]
        ]
    ]
).astype(np.float32)
W = np.array(
    [
        [
            [
                [1.0, 1.0, 1.0],  # (1, 1, 3, 3) tensor for convolution weights
                [1.0, 1.0, 1.0],
                [1.0, 1.0, 1.0],
            ]
        ]
    ]
).astype(np.float32)

# Convolution with padding
node_with_padding = onnx.helper.make_node(
    "Conv",
    inputs=["x", "W"],
    outputs=["y"],
    kernel_shape=[3, 3],
    # Default values for other attributes: strides=[1, 1], dilations=[1, 1], group=1
    pads=[1, 1, 1, 1],
)
y_with_padding = np.array(
    [
        [
            [
                [12.0, 21.0, 27.0, 33.0, 24.0],  # (1, 1, 5, 5) output tensor
                [33.0, 54.0, 63.0, 72.0, 51.0],
                [63.0, 99.0, 108.0, 117.0, 81.0],
                [93.0, 144.0, 153.0, 162.0, 111.0],
                [72.0, 111.0, 117.0, 123.0, 84.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node_with_padding,
    inputs=[x, W],
    outputs=[y_with_padding],
    name="test_basic_conv_with_padding",
)

# Convolution without padding
node_without_padding = onnx.helper.make_node(
    "Conv",
    inputs=["x", "W"],
    outputs=["y"],
    kernel_shape=[3, 3],
    # Default values for other attributes: strides=[1, 1], dilations=[1, 1], group=1
    pads=[0, 0, 0, 0],
)
y_without_padding = np.array(
    [
        [
            [
                [54.0, 63.0, 72.0],  # (1, 1, 3, 3) output tensor
                [99.0, 108.0, 117.0],
                [144.0, 153.0, 162.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node_without_padding,
    inputs=[x, W],
    outputs=[y_without_padding],
    name="test_basic_conv_without_padding",
)
```

</details>


<details>
<summary>conv_with_autopad_same</summary>

```python
x = np.array(
    [
        [
            [
                [0.0, 1.0, 2.0, 3.0, 4.0],  # (1, 1, 5, 5) input tensor
                [5.0, 6.0, 7.0, 8.0, 9.0],
                [10.0, 11.0, 12.0, 13.0, 14.0],
                [15.0, 16.0, 17.0, 18.0, 19.0],
                [20.0, 21.0, 22.0, 23.0, 24.0],
            ]
        ]
    ]
).astype(np.float32)
W = np.array(
    [
        [
            [
                [1.0, 1.0, 1.0],  # (1, 1, 3, 3) tensor for convolution weights
                [1.0, 1.0, 1.0],
                [1.0, 1.0, 1.0],
            ]
        ]
    ]
).astype(np.float32)

# Convolution with auto_pad='SAME_LOWER' and strides=2
node = onnx.helper.make_node(
    "Conv",
    inputs=["x", "W"],
    outputs=["y"],
    auto_pad="SAME_LOWER",
    kernel_shape=[3, 3],
    strides=[2, 2],
)
y = np.array(
    [[[[12.0, 27.0, 24.0], [63.0, 108.0, 81.0], [72.0, 117.0, 84.0]]]]
).astype(np.float32)
expect(node, inputs=[x, W], outputs=[y], name="test_conv_with_autopad_same")
```

</details>


<details>
<summary>conv_with_strides</summary>

```python
x = np.array(
    [
        [
            [
                [0.0, 1.0, 2.0, 3.0, 4.0],  # (1, 1, 7, 5) input tensor
                [5.0, 6.0, 7.0, 8.0, 9.0],
                [10.0, 11.0, 12.0, 13.0, 14.0],
                [15.0, 16.0, 17.0, 18.0, 19.0],
                [20.0, 21.0, 22.0, 23.0, 24.0],
                [25.0, 26.0, 27.0, 28.0, 29.0],
                [30.0, 31.0, 32.0, 33.0, 34.0],
            ]
        ]
    ]
).astype(np.float32)
W = np.array(
    [
        [
            [
                [1.0, 1.0, 1.0],  # (1, 1, 3, 3) tensor for convolution weights
                [1.0, 1.0, 1.0],
                [1.0, 1.0, 1.0],
            ]
        ]
    ]
).astype(np.float32)

# Convolution with strides=2 and padding
node_with_padding = onnx.helper.make_node(
    "Conv",
    inputs=["x", "W"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[1, 1, 1, 1],
    strides=[
        2,
        2,
    ],  # Default values for other attributes: dilations=[1, 1], group=1
)
y_with_padding = np.array(
    [
        [
            [
                [12.0, 27.0, 24.0],  # (1, 1, 4, 3) output tensor
                [63.0, 108.0, 81.0],
                [123.0, 198.0, 141.0],
                [112.0, 177.0, 124.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node_with_padding,
    inputs=[x, W],
    outputs=[y_with_padding],
    name="test_conv_with_strides_padding",
)

# Convolution with strides=2 and no padding
node_without_padding = onnx.helper.make_node(
    "Conv",
    inputs=["x", "W"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[0, 0, 0, 0],
    strides=[
        2,
        2,
    ],  # Default values for other attributes: dilations=[1, 1], group=1
)
y_without_padding = np.array(
    [
        [
            [
                [54.0, 72.0],  # (1, 1, 3, 2) output tensor
                [144.0, 162.0],
                [234.0, 252.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node_without_padding,
    inputs=[x, W],
    outputs=[y_without_padding],
    name="test_conv_with_strides_no_padding",
)

# Convolution with strides=2 and padding only along one dimension (the H dimension in NxCxHxW tensor)
node_with_asymmetric_padding = onnx.helper.make_node(
    "Conv",
    inputs=["x", "W"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[1, 0, 1, 0],
    strides=[
        2,
        2,
    ],  # Default values for other attributes: dilations=[1, 1], group=1
)
y_with_asymmetric_padding = np.array(
    [
        [
            [
                [21.0, 33.0],  # (1, 1, 4, 2) output tensor
                [99.0, 117.0],
                [189.0, 207.0],
                [171.0, 183.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node_with_asymmetric_padding,
    inputs=[x, W],
    outputs=[y_with_asymmetric_padding],
    name="test_conv_with_strides_and_asymmetric_padding",
)
```

</details>


### <a name="ConvInteger"></a><a name="convinteger">**ConvInteger**</a>

  The integer convolution operator consumes an input tensor, its zero-point, a filter, and its zero-point,
  and computes the output. The product MUST never overflow. The accumulation may overflow if and only if it is performed in 32 bits.
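
A naive NumPy sketch of the computation for a single 2D channel may help. It rests on assumptions not stated in the text: the zero points are subtracted after widening to int32, padding contributes zeros after that subtraction, and the kernel is applied without flipping. `conv_integer_2d` is a hypothetical helper, not an ONNX API.

```python
import numpy as np


def conv_integer_2d(x, w, x_zero_point=0, w_zero_point=0, pad=0):
    """Integer convolution of one 2D channel with stride 1 (hypothetical helper)."""
    # Widen to int32 before subtracting so uint8/int8 values cannot wrap around.
    xs = x.astype(np.int32) - np.int32(x_zero_point)
    ws = w.astype(np.int32) - np.int32(w_zero_point)
    xs = np.pad(xs, pad, mode="constant")
    kh, kw = ws.shape
    out_h = xs.shape[0] - kh + 1
    out_w = xs.shape[1] - kw + 1
    y = np.zeros((out_h, out_w), dtype=np.int32)
    for i in range(out_h):
        for j in range(out_w):
            # Accumulate the products of the current window in int32.
            y[i, j] = np.sum(xs[i:i + kh, j:j + kw] * ws)
    return y


x = np.arange(2, 11, dtype=np.uint8).reshape(3, 3)
w = np.ones((2, 2), dtype=np.uint8)
y = conv_integer_2d(x, w, x_zero_point=np.uint8(1))
print(y.tolist())  # [[12, 16], [24, 28]]
```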

#### Version

This version of the operator has been available since version 10 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string (default is NOTSET)</dt>
<dd>auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER or SAME_LOWER means the input is padded so that `output_shape[i] = ceil(input_shape[i] / strides[i])` for each axis `i`. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). If the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each axis.</dd>
<dt><tt>group</tt> : int (default is 1)</dt>
<dd>number of groups input channels and output channels are divided into. default is 1.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>The shape of the convolution kernel. If not present, should be inferred from input 'w'.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each spatial axis; it can take any value greater than or equal to 0. The values represent the number of pixels added to the beginning and end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each axis.</dd>
</dl>

#### Inputs (2 - 4)

<dl>
<dt><tt>x</tt> : T1</dt>
<dd>Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 ... x Dn). Optionally, if dimension denotation is in effect, the operation expects input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
<dt><tt>w</tt> : T2</dt>
<dd>The weight tensor that will be used in the convolutions; has size (M x C/group x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the kernel shape will be (M x C/group x k1 x k2 x ... x kn), where (k1 x k2 x ... kn) is the dimension of the kernel. Optionally, if dimension denotation is in effect, the operation expects the weight tensor to arrive with the dimension denotation of [FILTER_OUT_CHANNEL, FILTER_IN_CHANNEL, FILTER_SPATIAL, FILTER_SPATIAL ...]. X.shape[1] == (W.shape[1] * group) == C (assuming zero based indices for the shape array). In other words, FILTER_IN_CHANNEL multiplied by the number of groups should be equal to DATA_CHANNEL.</dd>
<dt><tt>x_zero_point</tt> (optional) : T1</dt>
<dd>Zero point tensor for input 'x'. It's optional and default value is 0. It's a scalar, which means a per-tensor/layer quantization.</dd>
<dt><tt>w_zero_point</tt> (optional) : T2</dt>
<dd>Zero point tensor for input 'w'. It's optional and default value is 0.  It could be a scalar or a 1-D tensor, which means a per-tensor/layer or per output channel quantization. If it's a 1-D tensor, its number of elements should be equal to the number of output channels (M)</dd>
</dl>

#### Outputs

<dl>
<dt><tt>y</tt> : T3</dt>
<dd>Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int8), tensor(uint8)</dt>
<dd>Constrain input x and its zero point data type to 8-bit integer tensor.</dd>
<dt><tt>T2</tt> : tensor(int8), tensor(uint8)</dt>
<dd>Constrain input w and its zero point data type to 8-bit integer tensor.</dd>
<dt><tt>T3</tt> : tensor(int32)</dt>
<dd>Constrain output y data type to 32-bit integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>with_padding</summary>

```python
x = (
    np.array([2, 3, 4, 5, 6, 7, 8, 9, 10])
    .astype(np.uint8)
    .reshape((1, 1, 3, 3))
)
x_zero_point = np.uint8(1)
w = np.array([1, 1, 1, 1]).astype(np.uint8).reshape((1, 1, 2, 2))

y = (
    np.array([1, 3, 5, 3, 5, 12, 16, 9, 11, 24, 28, 15, 7, 15, 17, 9])
    .astype(np.int32)
    .reshape((1, 1, 4, 4))
)

# ConvInteger with padding
convinteger_node_with_padding = onnx.helper.make_node(
    "ConvInteger",
    inputs=["x", "w", "x_zero_point"],
    outputs=["y"],
    pads=[1, 1, 1, 1],
)

expect(
    convinteger_node_with_padding,
    inputs=[x, w, x_zero_point],
    outputs=[y],
    name="test_convinteger_with_padding",
)
```

</details>


<details>
<summary>without_padding</summary>

```python
x = (
    np.array([2, 3, 4, 5, 6, 7, 8, 9, 10])
    .astype(np.uint8)
    .reshape((1, 1, 3, 3))
)
x_zero_point = np.uint8(1)
w = np.array([1, 1, 1, 1]).astype(np.uint8).reshape((1, 1, 2, 2))

y = np.array([12, 16, 24, 28]).astype(np.int32).reshape(1, 1, 2, 2)

# ConvInteger without padding
convinteger_node = onnx.helper.make_node(
    "ConvInteger", inputs=["x", "w", "x_zero_point"], outputs=["y"]
)

expect(
    convinteger_node,
    inputs=[x, w, x_zero_point],
    outputs=[y],
    name="test_convinteger_without_padding",
)
```

</details>


### <a name="ConvTranspose"></a><a name="convtranspose">**ConvTranspose**</a>

  The convolution transpose operator consumes an input tensor and a filter,
  and computes the output.

  If the pads parameter is provided the shape of the output is calculated via the following equation:

    output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - pads[start_i] - pads[end_i]

  output_shape can also be explicitly specified in which case pads values are auto generated using these equations:

    total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]
    If (auto_pad == SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
    Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).
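
The equations above can be sketched in plain Python as follows. The function names are hypothetical, the `/2` in the equations is assumed to be integer (floor) division, and the total padding is assumed to be non-negative; `pads` follows the `[x1_begin, x2_begin, ..., x1_end, x2_end, ...]` layout.

```python
def conv_transpose_output_shape(input_size, kernel_shape, strides, pads,
                                dilations=None, output_padding=None):
    """Spatial output shape when explicit pads are given."""
    n = len(input_size)
    dilations = dilations or [1] * n
    output_padding = output_padding or [0] * n
    return [
        strides[i] * (input_size[i] - 1)
        + output_padding[i]
        + ((kernel_shape[i] - 1) * dilations[i] + 1)
        - pads[i]        # pads[start_i]
        - pads[i + n]    # pads[end_i]
        for i in range(n)
    ]


def conv_transpose_pads(input_size, kernel_shape, strides, output_shape,
                        auto_pad="NOTSET", dilations=None, output_padding=None):
    """Pads generated when output_shape is given explicitly."""
    n = len(input_size)
    dilations = dilations or [1] * n
    output_padding = output_padding or [0] * n
    begins, ends = [], []
    for i in range(n):
        total = (
            strides[i] * (input_size[i] - 1)
            + output_padding[i]
            + ((kernel_shape[i] - 1) * dilations[i] + 1)
            - output_shape[i]
        )
        half = total // 2  # assumption: "/" means integer division
        if auto_pad == "SAME_UPPER":
            begins.append(half)
            ends.append(total - half)
        else:
            begins.append(total - half)
            ends.append(half)
    return begins + ends


# 3x3 input, 3x3 kernel, strides [3, 2], pads [1, 2, 1, 2]
shape = conv_transpose_output_shape([3, 3], [3, 3], [3, 2], [1, 2, 1, 2])
# shape == [7, 3]
```

The first call uses the same attributes as the `convtranspose_pads` example for this operator, whose expected output has spatial shape `(7, 3)`.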



#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ConvTranspose-1">1</a>

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string (default is NOTSET)</dt>
<dd>auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER and SAME_LOWER pad the input so that `output_shape[i] = input_shape[i] * strides[i]` for each axis `i`. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). If the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.</dd>
<dt><tt>group</tt> : int (default is 1)</dt>
<dd>number of groups input channels and output channels are divided into.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>The shape of the convolution kernel. If not present, should be inferred from input W.</dd>
<dt><tt>output_padding</tt> : list of ints</dt>
<dd>Additional elements added to the side with higher coordinate indices in the output. Each padding value in "output_padding" must be less than the corresponding stride/dilation dimension. By default, this attribute is a zero vector. Note that this attribute doesn't directly affect the computed output values. It only controls the selection of the computed values, so changing this attribute only adds or removes output elements. If "output_shape" is explicitly provided, "output_padding" does not contribute additional size to "output_shape" but participates in the computation of the needed padding amount. This is also called adjs or adjustment in some frameworks.</dd>
<dt><tt>output_shape</tt> : list of ints</dt>
<dd>The shape of the output can be set explicitly, which causes the pads values to be auto-generated. If output_shape is specified, pads values are ignored. See the operator description for the equations used to generate pads. Note that the output_shape attribute value should not include dimensions for batch size and channels, which are automatically inferred.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and end along each spatial axis; it can take any value greater than or equal to 0. The values represent the number of pixels added to the beginning and end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 ... x Dn)</dd>
<dt><tt>W</tt> (differentiable) : T</dt>
<dd>The weight tensor that will be used in the convolutions; has size (C x M/group x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the weight shape will be (C x M/group x k1 x k2 x ... x kn), where (k1 x k2 x ... x kn) is the dimension of the kernel. The number of channels in the output should be equal to W.shape[1] * group (assuming zero based indices of the shape array)</dd>
<dt><tt>B</tt> (optional, differentiable) : T</dt>
<dd>Optional 1D bias to be added to the convolution, has size of M.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, pad lengths and group count. The number of channels in the output should be equal to W.shape[1] * group (assuming zero based indices of the shape array)</dd>
</dl>
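
As an illustration of how the output values arise, here is a minimal NumPy sketch of a 1-D transposed convolution written as a scatter-add. The helper name `conv_transpose_1d` is hypothetical, and the sketch assumes a single group, zero pads, zero output_padding, and no bias.

```python
import numpy as np


def conv_transpose_1d(x, w, stride=1, dilation=1):
    """Naive 1-D transposed convolution: group 1, no pads, no output_padding."""
    n, c, length = x.shape
    c_w, m, k = w.shape
    assert c == c_w, "X and W must agree on the number of input channels"
    out_len = stride * (length - 1) + (k - 1) * dilation + 1
    y = np.zeros((n, m, out_len), dtype=x.dtype)
    for i in range(length):
        for j in range(k):
            # x[:, :, i] is (N, C) and w[:, :, j] is (C, M); the product is (N, M).
            # Each input element is scattered into the output through the kernel.
            y[:, :, i * stride + j * dilation] += x[:, :, i] @ w[:, :, j]
    return y


x = np.array([[[0.0, 1.0, 2.0]]]).astype(np.float32)  # (1, 1, 3)
w = np.ones((1, 2, 3), dtype=np.float32)              # (1, 2, 3)
y = conv_transpose_1d(x, w)
# y == [[[0, 1, 3, 3, 2], [0, 1, 3, 3, 2]]], shape (1, 2, 5)
```

The inputs match the `convtranspose_1d` example for this operator. Explicit pads would then trim elements from the start and end of each spatial axis of this full result.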

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>convtranspose</summary>

```python
x = np.array(
    [[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]]  # (1, 1, 3, 3)
).astype(np.float32)

W = np.array(
    [
        [
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],  # (1, 2, 3, 3)
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])

y = np.array(
    [
        [
            [
                [0.0, 1.0, 3.0, 3.0, 2.0],  # (1, 2, 5, 5)
                [3.0, 8.0, 15.0, 12.0, 7.0],
                [9.0, 21.0, 36.0, 27.0, 15.0],
                [9.0, 20.0, 33.0, 24.0, 13.0],
                [6.0, 13.0, 21.0, 15.0, 8.0],
            ],
            [
                [0.0, 1.0, 3.0, 3.0, 2.0],
                [3.0, 8.0, 15.0, 12.0, 7.0],
                [9.0, 21.0, 36.0, 27.0, 15.0],
                [9.0, 20.0, 33.0, 24.0, 13.0],
                [6.0, 13.0, 21.0, 15.0, 8.0],
            ],
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose")
```

</details>


<details>
<summary>convtranspose_1d</summary>

```python
x = np.array([[[0.0, 1.0, 2.0]]]).astype(np.float32)  # (1, 1, 3)

W = np.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]).astype(  # (1, 2, 3)
    np.float32
)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])

y = np.array(
    [[[0.0, 1.0, 3.0, 3.0, 2.0], [0.0, 1.0, 3.0, 3.0, 2.0]]]  # (1, 2, 5)
).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_1d")
```

</details>


<details>
<summary>convtranspose_3d</summary>

```python
x = np.array(
    [
        [
            [
                [
                    [0.0, 1.0, 2.0, 3.0, 4.0],  # (1, 1, 3, 4, 5)
                    [5.0, 6.0, 7.0, 8.0, 9.0],
                    [10.0, 11.0, 12.0, 13.0, 14.0],
                    [15.0, 16.0, 17.0, 18.0, 19.0],
                ],
                [
                    [20.0, 21.0, 22.0, 23.0, 24.0],
                    [25.0, 26.0, 27.0, 28.0, 29.0],
                    [30.0, 31.0, 32.0, 33.0, 34.0],
                    [35.0, 36.0, 37.0, 38.0, 39.0],
                ],
                [
                    [40.0, 41.0, 42.0, 43.0, 44.0],
                    [45.0, 46.0, 47.0, 48.0, 49.0],
                    [50.0, 51.0, 52.0, 53.0, 54.0],
                    [55.0, 56.0, 57.0, 58.0, 59.0],
                ],
            ]
        ]
    ]
).astype(np.float32)

W = np.array(
    [
        [
            [
                [
                    [1.0, 1.0, 1.0],  # (1, 2, 3, 3, 3)
                    [1.0, 1.0, 1.0],
                    [1.0, 1.0, 1.0],
                ],
                [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
                [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
            ],
            [
                [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
                [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
                [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
            ],
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])

y = np.array(
    [
        [
            [
                [
                    [0.0, 1.0, 3.0, 6.0, 9.0, 7.0, 4.0],  # (1, 2, 5, 6, 7)
                    [5.0, 12.0, 21.0, 27.0, 33.0, 24.0, 13.0],
                    [15.0, 33.0, 54.0, 63.0, 72.0, 51.0, 27.0],
                    [30.0, 63.0, 99.0, 108.0, 117.0, 81.0, 42.0],
                    [25.0, 52.0, 81.0, 87.0, 93.0, 64.0, 33.0],
                    [15.0, 31.0, 48.0, 51.0, 54.0, 37.0, 19.0],
                ],
                [
                    [20.0, 42.0, 66.0, 72.0, 78.0, 54.0, 28.0],
                    [50.0, 104.0, 162.0, 174.0, 186.0, 128.0, 66.0],
                    [90.0, 186.0, 288.0, 306.0, 324.0, 222.0, 114.0],
                    [120.0, 246.0, 378.0, 396.0, 414.0, 282.0, 144.0],
                    [90.0, 184.0, 282.0, 294.0, 306.0, 208.0, 106.0],
                    [50.0, 102.0, 156.0, 162.0, 168.0, 114.0, 58.0],
                ],
                [
                    [60.0, 123.0, 189.0, 198.0, 207.0, 141.0, 72.0],
                    [135.0, 276.0, 423.0, 441.0, 459.0, 312.0, 159.0],
                    [225.0, 459.0, 702.0, 729.0, 756.0, 513.0, 261.0],
                    [270.0, 549.0, 837.0, 864.0, 891.0, 603.0, 306.0],
                    [195.0, 396.0, 603.0, 621.0, 639.0, 432.0, 219.0],
                    [105.0, 213.0, 324.0, 333.0, 342.0, 231.0, 117.0],
                ],
                [
                    [60.0, 122.0, 186.0, 192.0, 198.0, 134.0, 68.0],
                    [130.0, 264.0, 402.0, 414.0, 426.0, 288.0, 146.0],
                    [210.0, 426.0, 648.0, 666.0, 684.0, 462.0, 234.0],
                    [240.0, 486.0, 738.0, 756.0, 774.0, 522.0, 264.0],
                    [170.0, 344.0, 522.0, 534.0, 546.0, 368.0, 186.0],
                    [90.0, 182.0, 276.0, 282.0, 288.0, 194.0, 98.0],
                ],
                [
                    [40.0, 81.0, 123.0, 126.0, 129.0, 87.0, 44.0],
                    [85.0, 172.0, 261.0, 267.0, 273.0, 184.0, 93.0],
                    [135.0, 273.0, 414.0, 423.0, 432.0, 291.0, 147.0],
                    [150.0, 303.0, 459.0, 468.0, 477.0, 321.0, 162.0],
                    [105.0, 212.0, 321.0, 327.0, 333.0, 224.0, 113.0],
                    [55.0, 111.0, 168.0, 171.0, 174.0, 117.0, 59.0],
                ],
            ],
            [
                [
                    [0.0, 1.0, 3.0, 6.0, 9.0, 7.0, 4.0],
                    [5.0, 12.0, 21.0, 27.0, 33.0, 24.0, 13.0],
                    [15.0, 33.0, 54.0, 63.0, 72.0, 51.0, 27.0],
                    [30.0, 63.0, 99.0, 108.0, 117.0, 81.0, 42.0],
                    [25.0, 52.0, 81.0, 87.0, 93.0, 64.0, 33.0],
                    [15.0, 31.0, 48.0, 51.0, 54.0, 37.0, 19.0],
                ],
                [
                    [20.0, 42.0, 66.0, 72.0, 78.0, 54.0, 28.0],
                    [50.0, 104.0, 162.0, 174.0, 186.0, 128.0, 66.0],
                    [90.0, 186.0, 288.0, 306.0, 324.0, 222.0, 114.0],
                    [120.0, 246.0, 378.0, 396.0, 414.0, 282.0, 144.0],
                    [90.0, 184.0, 282.0, 294.0, 306.0, 208.0, 106.0],
                    [50.0, 102.0, 156.0, 162.0, 168.0, 114.0, 58.0],
                ],
                [
                    [60.0, 123.0, 189.0, 198.0, 207.0, 141.0, 72.0],
                    [135.0, 276.0, 423.0, 441.0, 459.0, 312.0, 159.0],
                    [225.0, 459.0, 702.0, 729.0, 756.0, 513.0, 261.0],
                    [270.0, 549.0, 837.0, 864.0, 891.0, 603.0, 306.0],
                    [195.0, 396.0, 603.0, 621.0, 639.0, 432.0, 219.0],
                    [105.0, 213.0, 324.0, 333.0, 342.0, 231.0, 117.0],
                ],
                [
                    [60.0, 122.0, 186.0, 192.0, 198.0, 134.0, 68.0],
                    [130.0, 264.0, 402.0, 414.0, 426.0, 288.0, 146.0],
                    [210.0, 426.0, 648.0, 666.0, 684.0, 462.0, 234.0],
                    [240.0, 486.0, 738.0, 756.0, 774.0, 522.0, 264.0],
                    [170.0, 344.0, 522.0, 534.0, 546.0, 368.0, 186.0],
                    [90.0, 182.0, 276.0, 282.0, 288.0, 194.0, 98.0],
                ],
                [
                    [40.0, 81.0, 123.0, 126.0, 129.0, 87.0, 44.0],
                    [85.0, 172.0, 261.0, 267.0, 273.0, 184.0, 93.0],
                    [135.0, 273.0, 414.0, 423.0, 432.0, 291.0, 147.0],
                    [150.0, 303.0, 459.0, 468.0, 477.0, 321.0, 162.0],
                    [105.0, 212.0, 321.0, 327.0, 333.0, 224.0, 113.0],
                    [55.0, 111.0, 168.0, 171.0, 174.0, 117.0, 59.0],
                ],
            ],
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_3d")
```

</details>


<details>
<summary>convtranspose_attributes</summary>

```python
x = np.array(
    [[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]]  # (1, 1, 3, 3)
).astype(np.float32)

W = np.array(
    [
        [
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],  # (1, 2, 3, 3)
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
        ]
    ]
).astype(np.float32)

y = np.array(
    [
        [
            [
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],  # (1, 2, 10, 8)
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
                [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            ],
            [
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0, 2.0, 0.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0, 5.0, 0.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0, 8.0, 0.0],
                [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            ],
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "ConvTranspose", ["X", "W"], ["Y"], strides=[3, 2], output_shape=[10, 8]
)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_output_shape")

node = onnx.helper.make_node(
    "ConvTranspose", ["X", "W"], ["Y"], strides=[3, 2], output_padding=[1, 1]
)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_pad")

node = onnx.helper.make_node(
    "ConvTranspose",
    ["X", "W"],
    ["Y"],
    name="test",
    strides=[3, 2],
    output_shape=[10, 8],
    kernel_shape=[3, 3],
    output_padding=[1, 1],
)
expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_kernel_shape")
```

</details>


<details>
<summary>convtranspose_autopad_same</summary>

```python
x = np.array(
    [[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]]  # (1, 1, 3, 3)
).astype(np.float32)

W = np.array(
    [
        [
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],  # (1, 2, 3, 3)
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "ConvTranspose", ["X", "W"], ["Y"], auto_pad="SAME_UPPER", strides=[2, 2]
)

y = np.array(
    [
        [
            [
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
                [3.0, 3.0, 8.0, 5.0, 12.0, 7.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0],
                [9.0, 9.0, 20.0, 11.0, 24.0, 13.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0],
            ],
            [
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
                [0.0, 0.0, 1.0, 1.0, 3.0, 2.0],
                [3.0, 3.0, 8.0, 5.0, 12.0, 7.0],
                [3.0, 3.0, 7.0, 4.0, 9.0, 5.0],
                [9.0, 9.0, 20.0, 11.0, 24.0, 13.0],
                [6.0, 6.0, 13.0, 7.0, 15.0, 8.0],
            ],
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_autopad_same")
```

</details>


<details>
<summary>convtranspose_dilations</summary>

```python
x = np.array(
    [[[[3.0, 8.0, 1.0], [9.0, 5.0, 7.0], [3.0, 2.0, 6.0]]]]  # (1, 1, 3, 3)
).astype(np.float32)
W = np.array([[[[7.0, 2.0], [1.0, 9.0]]]]).astype(np.float32)  # (1, 1, 2, 2)

node = onnx.helper.make_node(
    "ConvTranspose", ["X", "W"], ["Y"], dilations=[2, 2]
)

y = np.array(
    [
        [
            [
                [21.0, 56.0, 13.0, 16.0, 2.0],  # (1, 1, 5, 5)
                [63.0, 35.0, 67.0, 10.0, 14.0],
                [24.0, 22.0, 76.0, 76.0, 21.0],
                [9.0, 5.0, 88.0, 45.0, 63.0],
                [3.0, 2.0, 33.0, 18.0, 54.0],
            ]
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_dilations")
```

</details>


<details>
<summary>convtranspose_pads</summary>

```python
x = np.array(
    [[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]]]  # (1, 1, 3, 3)
).astype(np.float32)

W = np.array(
    [
        [
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],  # (1, 2, 3, 3)
            [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
        ]
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "ConvTranspose", ["X", "W"], ["Y"], strides=[3, 2], pads=[1, 2, 1, 2]
)

y = np.array(
    [
        [
            [
                [1.0, 1.0, 3.0],  # (1, 2, 7, 3)
                [1.0, 1.0, 3.0],
                [7.0, 4.0, 9.0],
                [7.0, 4.0, 9.0],
                [7.0, 4.0, 9.0],
                [13.0, 7.0, 15.0],
                [13.0, 7.0, 15.0],
            ],
            [
                [1.0, 1.0, 3.0],
                [1.0, 1.0, 3.0],
                [7.0, 4.0, 9.0],
                [7.0, 4.0, 9.0],
                [7.0, 4.0, 9.0],
                [13.0, 7.0, 15.0],
                [13.0, 7.0, 15.0],
            ],
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name="test_convtranspose_pads")
```

</details>


### <a name="Cos"></a><a name="cos">**Cos**</a>

  Calculates the cosine of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The cosine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>cos</summary>

```python
node = onnx.helper.make_node(
    "Cos",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.cos(x)
expect(node, inputs=[x], outputs=[y], name="test_cos_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.cos(x)
expect(node, inputs=[x], outputs=[y], name="test_cos")
```

</details>


### <a name="Cosh"></a><a name="cosh">**Cosh**</a>

  Calculates the hyperbolic cosine of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The hyperbolic cosine values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>cosh</summary>

```python
node = onnx.helper.make_node(
    "Cosh",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.cosh(x)  # expected output [1.54308069,  1.,  1.54308069]
expect(node, inputs=[x], outputs=[y], name="test_cosh_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.cosh(x)
expect(node, inputs=[x], outputs=[y], name="test_cosh")
```

</details>


### <a name="CumSum"></a><a name="cumsum">**CumSum**</a>

  Performs cumulative sum of the input elements along the given axis.
  By default, the sum is inclusive, meaning the first element is copied as is.
  Setting the `exclusive` attribute to 1 excludes each element from its own sum, so the first output is 0.
  The summation can also run in the opposite direction along the axis. For that, set the `reverse` attribute to 1.

  Example:
  ```
  input_x = [1, 2, 3]
  axis=0
  output = [1, 3, 6]
  exclusive=1
  output = [0, 1, 3]
  exclusive=0
  reverse=1
  output = [6, 5, 3]
  exclusive=1
  reverse=1
  output = [5, 3, 0]
  ```
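
The four combinations above can be reproduced with a short NumPy sketch. The function name `cumsum` and its keyword arguments are illustrative only. Subtracting the input to obtain the exclusive sum is a shortcut that is exact for the small values used here but may behave differently from a shifted sum for floating-point inputs containing infinities or NaNs.

```python
import numpy as np


def cumsum(x, axis=0, exclusive=0, reverse=0):
    """Cumulative sum with optional exclusive and reverse modes."""
    x = np.asarray(x)
    if reverse:
        # Summing from the end is the same as flipping, summing, and flipping back.
        x = np.flip(x, axis=axis)
    y = np.cumsum(x, axis=axis)
    if exclusive:
        # Remove each element's own contribution from its running total.
        y = y - x
    if reverse:
        y = np.flip(y, axis=axis)
    return y


input_x = [1, 2, 3]
inclusive = cumsum(input_x)                        # [1, 3, 6]
exclusive = cumsum(input_x, exclusive=1)           # [0, 1, 3]
reverse = cumsum(input_x, reverse=1)               # [6, 5, 3]
both = cumsum(input_x, exclusive=1, reverse=1)     # [5, 3, 0]
```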


#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#CumSum-11">11</a>

#### Attributes

<dl>
<dt><tt>exclusive</tt> : int (default is 0)</dt>
<dd>If set to 1, returns the exclusive sum, in which the top element is not included. In other words, if set to 1, the j-th output element is the sum of the first (j-1) elements. Otherwise, it is the sum of the first j elements.</dd>
<dt><tt>reverse</tt> : int (default is 0)</dt>
<dd>If set to 1, performs the sums in the reverse direction.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>x</tt> (differentiable) : T</dt>
<dd>An input tensor that is to be processed.</dd>
<dt><tt>axis</tt> (non-differentiable) : T2</dt>
<dd>A 0-D tensor. Must be in the range [-rank(x), rank(x)-1]. Negative value means counting dimensions from the back.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>y</tt> (differentiable) : T</dt>
<dd>Output tensor of the same type as 'x' with cumulative sums of the x's elements</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to high-precision numeric tensors.</dd>
<dt><tt>T2</tt> : tensor(int32), tensor(int64)</dt>
<dd>axis tensor can be int32 or int64 only</dd>
</dl>


#### Examples

<details>
<summary>cumsum_1d</summary>

```python
node = onnx.helper.make_node("CumSum", inputs=["x", "axis"], outputs=["y"])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([1.0, 3.0, 6.0, 10.0, 15.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d")
```

</details>


<details>
<summary>cumsum_1d_exclusive</summary>

```python
node = onnx.helper.make_node(
    "CumSum", inputs=["x", "axis"], outputs=["y"], exclusive=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([0.0, 1.0, 3.0, 6.0, 10.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d_exclusive")
```

</details>


<details>
<summary>cumsum_1d_reverse</summary>

```python
node = onnx.helper.make_node(
    "CumSum", inputs=["x", "axis"], outputs=["y"], reverse=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([15.0, 14.0, 12.0, 9.0, 5.0]).astype(np.float64)
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d_reverse")
```

</details>


<details>
<summary>cumsum_1d_reverse_exclusive</summary>

```python
node = onnx.helper.make_node(
    "CumSum", inputs=["x", "axis"], outputs=["y"], reverse=1, exclusive=1
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).astype(np.float64)
axis = np.int32(0)
y = np.array([14.0, 12.0, 9.0, 5.0, 0.0]).astype(np.float64)
expect(
    node, inputs=[x, axis], outputs=[y], name="test_cumsum_1d_reverse_exclusive"
)
```

</details>


<details>
<summary>cumsum_2d_axis_0</summary>

```python
node = onnx.helper.make_node(
    "CumSum",
    inputs=["x", "axis"],
    outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.int32(0)
y = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 9.0]).astype(np.float64).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_2d_axis_0")
```

</details>


<details>
<summary>cumsum_2d_axis_1</summary>

```python
node = onnx.helper.make_node(
    "CumSum",
    inputs=["x", "axis"],
    outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.int32(1)
y = np.array([1.0, 3.0, 6.0, 4.0, 9.0, 15.0]).astype(np.float64).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_2d_axis_1")
```

</details>


<details>
<summary>cumsum_2d_negative_axis</summary>

```python
node = onnx.helper.make_node(
    "CumSum",
    inputs=["x", "axis"],
    outputs=["y"],
)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float64).reshape((2, 3))
axis = np.int32(-1)
y = np.array([1.0, 3.0, 6.0, 4.0, 9.0, 15.0]).astype(np.float64).reshape((2, 3))
expect(node, inputs=[x, axis], outputs=[y], name="test_cumsum_2d_negative_axis")
```

</details>


### <a name="DFT"></a><a name="dft">**DFT**</a>

  Computes the discrete Fourier Transform (DFT) of the input.

  Assuming the input has shape `[M, N]`, where `N` is the dimension over which the
  DFT is computed and `M` denotes the conceptual "all other dimensions,"
  the DFT `y[m, k]` of shape `[M, N]` is defined as

  $$y[m, k] = \sum_{n=0}^{N-1} e^{-2 \pi j \frac{k n}{N} } x[m, n] ,$$

  and the inverse transform is defined as

  $$x[m, n] = \frac{1}{N} \sum_{k=0}^{N-1} e^{2 \pi j \frac{k n}{N} } y[m, k] ,$$

  where $j$ is the imaginary unit.

  The actual shape of the output is specified in the "output" section.

  Reference: https://docs.scipy.org/doc/scipy/tutorial/fft.html
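
As a sanity check on the two definitions above, the following NumPy sketch evaluates the sums directly along the last axis and compares them with `np.fft.fft`. The helper name `dft_last_axis` is hypothetical, and the sketch works on complex arrays rather than on the `[..., 2]` real/imaginary layout that the operator uses; the final line shows one way to pack a complex result into that layout.

```python
import numpy as np


def dft_last_axis(x, inverse=False):
    """Evaluate the defining DFT sum along the last axis of an array."""
    x = np.asarray(x, dtype=np.complex128)
    n = x.shape[-1]
    k = np.arange(n).reshape(-1, 1)    # output index k, as a column
    idx = np.arange(n).reshape(1, -1)  # input index n, as a row
    sign = 1.0 if inverse else -1.0
    kernel = np.exp(sign * 2j * np.pi * k * idx / n)  # kernel[k, n]
    y = x @ kernel.T                   # y[m, k] = sum_n kernel[k, n] * x[m, n]
    return y / n if inverse else y


x = np.arange(12, dtype=np.float64).reshape(3, 4)
y = dft_last_axis(x)
x_back = dft_last_axis(y, inverse=True)

# Pack the complex result as trailing [real, imag] pairs: shape (3, 4, 2).
y_packed = np.stack((y.real, y.imag), axis=-1)
```

The direct sum costs O(N^2) per signal and is meant only to mirror the formulas; practical implementations use a fast Fourier transform.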

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#DFT-17">17</a>

#### Attributes

<dl>
<dt><tt>inverse</tt> : int (default is 0)</dt>
<dd>Whether to perform the inverse discrete Fourier Transform. Default is 0, which corresponds to `false`.</dd>
<dt><tt>onesided</tt> : int (default is 0)</dt>
<dd>If `onesided` is `1` and input is real, only values for `k` in `[0, 1, 2, ..., floor(n_fft/2)]` (that is, `floor(n_fft/2) + 1` values) are returned because the real-to-complex Fourier transform satisfies the conjugate symmetry, i.e., `X[m, k] = X[m, n_fft-k]*`, where `m` denotes "all other dimensions" DFT was not applied on. If the input tensor is complex, onesided output is not possible. Value can be `0` or `1`. Default is `0`.</dd>
</dl>

#### Inputs (1 - 3)

<dl>
<dt><tt>input</tt> (non-differentiable) : T1</dt>
<dd>For real input, the following shape is expected: `[signal_dim0][signal_dim1][signal_dim2]...[signal_dimN][1]`. For complex input, the following shape is expected: `[signal_dim0][signal_dim1][signal_dim2]...[signal_dimN][2]`. The final dimension represents the real and imaginary parts of the value in that order.</dd>
<dt><tt>dft_length</tt> (optional, non-differentiable) : T2</dt>
<dd>The length of the signal as a scalar. If greater than the axis dimension, the signal will be zero-padded up to `dft_length`. If less than the axis dimension, only the first `dft_length` values will be used as the signal. </dd>
<dt><tt>axis</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>The axis as a scalar on which to perform the DFT. Default is `-2` (last signal axis). Negative value means counting dimensions from the back. Accepted range is $[-r, -2] \cup [0, r-2]$ where `r = rank(input)`. The last dimension is for representing complex numbers and thus is an invalid axis.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T1</dt>
<dd>The Fourier Transform of the input vector. If `onesided` is `0`, the following shape is expected: `[signal_dim0][signal_dim1][signal_dim2]...[signal_dimN][2]`. If `axis=0` and `onesided` is `1`, the following shape is expected: `[floor(signal_dim0/2)+1][signal_dim1][signal_dim2]...[signal_dimN][2]`. If `axis=1` and `onesided` is `1`, the following shape is expected: `[signal_dim0][floor(signal_dim1/2)+1][signal_dim2]...[signal_dimN][2]`. If `axis=N` and `onesided` is `1`, the following shape is expected: `[signal_dim0][signal_dim1][signal_dim2]...[floor(signal_dimN/2)+1][2]`. The `signal_dim` at the specified `axis` is equal to the `dft_length`.</dd>
</dl>
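
The shape rules above can be summarized in a small Python sketch. The helper `dft_output_shape` is hypothetical and covers only the forward transform; it assumes that `dft_length`, when given, replaces the signal dimension at `axis`, as stated in the output description.

```python
def dft_output_shape(input_shape, axis=-2, dft_length=None, onesided=0):
    """Expected output shape of a forward DFT for a given input shape."""
    rank = len(input_shape)
    axis = axis % rank  # map a negative axis to its non-negative equivalent
    if axis == rank - 1:
        raise ValueError("the last dimension holds real/imaginary parts")
    n = input_shape[axis] if dft_length is None else dft_length
    out = list(input_shape)
    # One-sided output keeps floor(n / 2) + 1 frequency bins.
    out[axis] = n // 2 + 1 if onesided else n
    out[-1] = 2  # the output always carries real and imaginary parts
    return out


full = dft_output_shape([1, 10, 10, 1], axis=1)              # [1, 10, 10, 2]
half = dft_output_shape([1, 10, 10, 1], axis=1, onesided=1)  # [1, 6, 10, 2]
```

For a real signal of length 10, the one-sided length of 6 agrees with the length returned by `numpy.fft.rfft`.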

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(bfloat16), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain scalar length types to integers.</dd>
</dl>


#### Examples

<details>
<summary>dft</summary>

```python
node = onnx.helper.make_node("DFT", inputs=["x", "", "axis"], outputs=["y"])
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
axis = np.array(1, dtype=np.int64)
y = np.fft.fft(x, axis=0)

x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft")

node = onnx.helper.make_node("DFT", inputs=["x", "", "axis"], outputs=["y"])
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
axis = np.array(2, dtype=np.int64)
y = np.fft.fft(x, axis=1)

x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft_axis")

node = onnx.helper.make_node(
    "DFT", inputs=["x", "", "axis"], outputs=["y"], inverse=1
)
x = np.arange(0, 100, dtype=np.complex64).reshape(10, 10)
axis = np.array(1, dtype=np.int64)
y = np.fft.ifft(x, axis=0)

x = np.stack((x.real, x.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(node, inputs=[x, axis], outputs=[y], name="test_dft_inverse")
```

</details>


<details>
<summary>opset19</summary>

```python
node = onnx.helper.make_node("DFT", inputs=["x"], outputs=["y"], axis=1)
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
y = np.fft.fft(x, axis=0)

x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_dft_opset19",
    opset_imports=[onnx.helper.make_opsetid("", 19)],
)

node = onnx.helper.make_node("DFT", inputs=["x"], outputs=["y"], axis=2)
x = np.arange(0, 100).reshape(10, 10).astype(np.float32)
y = np.fft.fft(x, axis=1)

x = x.reshape(1, 10, 10, 1)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_dft_axis_opset19",
    opset_imports=[onnx.helper.make_opsetid("", 19)],
)

node = onnx.helper.make_node(
    "DFT", inputs=["x"], outputs=["y"], inverse=1, axis=1
)
x = np.arange(0, 100, dtype=np.complex64).reshape(
    10,
    10,
)
y = np.fft.ifft(x, axis=0)

x = np.stack((x.real, x.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
y = np.stack((y.real, y.imag), axis=2).astype(np.float32).reshape(1, 10, 10, 2)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_dft_inverse_opset19",
    opset_imports=[onnx.helper.make_opsetid("", 19)],
)
```

</details>


### <a name="DeformConv"></a><a name="deformconv">**DeformConv**</a>

  Performs deformable convolution as described in https://arxiv.org/abs/1703.06211 and https://arxiv.org/abs/1811.11168.
  This operator specification supports the general N-D case. Note that most common use cases have 2D or 3D data.

#### Version

This version of the operator has been available since version 19 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>Dilation value along each spatial axis of the kernel. Default is 1 along each axis.</dd>
<dt><tt>group</tt> : int (default is 1)</dt>
<dd>Number of groups the input and output channels, C and oC, are divided into. C and oC must both be divisible by group. Default is 1.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>Shape of the convolution kernel. If not present, it is inferred from the shape of input W.</dd>
<dt><tt>offset_group</tt> : int (default is 1)</dt>
<dd>Number of groups of offset. C must be divisible by offset_group. Default is 1.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and end along each spatial axis. The values represent the number of pixels added to the beginning and end of the corresponding axis and can take any nonnegative value. The format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. Default is 0 along each axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. Default is 1 along each axis.</dd>
</dl>

#### Inputs (3 - 5)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Input data tensor. For 2D image data, it has shape (N, C, H, W) where N is the batch size, C is the number of input channels, and H and W are the height and width. In general, the shape is (N, C, D1, D2, ... , Dn) for n-dimensional data, where D1 to Dn are the spatial dimension sizes. Most common use cases have n = 2 or 3.</dd>
<dt><tt>W</tt> : T</dt>
<dd>Weight tensor that will be used in the convolutions. It has shape (oC, C/group, kH, kW), where oC is the number of output channels and kH and kW are the kernel height and width. For more than 2 dimensions, it has shape (oC, C/group, k1, k2, ... , kn).</dd>
<dt><tt>offset</tt> : T</dt>
<dd>Offset tensor denoting the offset for the sampling locations in the convolution kernel. It has shape (N, offset_group * kH * kW * 2, oH, oW) for 2D data or (N, offset_group * k1 * k2 * ... * kn * n, o1, o2, ... , on) for nD data. Linear interpolation is used for fractional offset values. Sampling locations outside of the padded input tensor give zero.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>Optional 1D bias of length oC to be added to the convolution. Default is a tensor of zeros.</dd>
<dt><tt>mask</tt> (optional) : T</dt>
<dd>The mask tensor to be applied to each position in the convolution kernel. It has shape (N, offset_group * kH * kW, oH, oW) for 2D data or (N, offset_group * k1 * k2 * ... * kn, o1, o2, ... , on) for nD data. Default is a tensor of ones.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Output data tensor that contains the result of the convolution. It has shape (N, oC, oH, oW) for 2D data or (N, oC, o1, o2, ..., on) for nD data.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
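
The sampling rule described for `offset` (linear interpolation at fractional positions, zero outside the input) can be sketched in NumPy. This is a minimal illustration rather than the reference implementation. It assumes 2D data, `group = 1`, `offset_group = 1`, and unit strides and dilations. It also assumes the offset channels are ordered as (kernel row, kernel column, h/w coordinate), which is inferred from the comments in the examples below. The names `bilinear_sample` and `deform_conv2d_sketch` are hypothetical.

```python
import numpy as np


def bilinear_sample(img, h, w):
    """Sample a 2D array at fractional (h, w); out-of-range neighbors count as 0."""
    H, W = img.shape
    h0, w0 = int(np.floor(h)), int(np.floor(w))
    dh, dw = h - h0, w - w0
    value = 0.0
    for hi, wh in ((h0, 1.0 - dh), (h0 + 1, dh)):
        for wi, ww in ((w0, 1.0 - dw), (w0 + 1, dw)):
            if 0 <= hi < H and 0 <= wi < W:
                value += wh * ww * img[hi, wi]
    return value


def deform_conv2d_sketch(X, W, offset, pads=(0, 0, 0, 0), B=None, mask=None):
    """Hypothetical helper: assumes group=1, offset_group=1, strides=1, dilations=1."""
    N, C, H, Wd = X.shape
    oC, _, kH, kW = W.shape
    # pads follows the [h_begin, w_begin, h_end, w_end] layout of the `pads` attribute.
    oH = H + pads[0] + pads[2] - kH + 1
    oW = Wd + pads[1] + pads[3] - kW + 1
    Y = np.zeros((N, oC, oH, oW), dtype=np.float64)
    for n, oc, i, j in np.ndindex(N, oC, oH, oW):
        acc = 0.0 if B is None else float(B[oc])
        for c, kh, kw in np.ndindex(C, kH, kW):
            k = kh * kW + kw  # flat index of the kernel element
            # Assumed layout: channel 2*k holds the h-offset, 2*k + 1 the w-offset.
            h = i - pads[0] + kh + offset[n, 2 * k, i, j]
            w = j - pads[1] + kw + offset[n, 2 * k + 1, i, j]
            m = 1.0 if mask is None else mask[n, k, i, j]
            acc += W[oc, c, kh, kw] * m * bilinear_sample(X[n, c], h, w)
        Y[n, oc, i, j] = acc
    return Y.astype(X.dtype)


X = np.arange(9, dtype=np.float32).reshape(1, 1, 3, 3)
W = np.ones((1, 1, 2, 2), dtype=np.float32)
offset = np.zeros((1, 8, 2, 2), dtype=np.float32)
offset[0, 0, 0, 0] = 0.5  # shift kernel element [0, 0] down by half a pixel
Y = deform_conv2d_sketch(X, W, offset)
# Y[0, 0, 0, 0] is 9.5: the shifted sample lies halfway between X values 0 and 3
# (1.5), and the three unshifted samples contribute 1 + 3 + 4.
```

The explicit Python loops keep the mapping between output position, kernel element, and offset channel visible. A practical implementation would vectorize the sampling.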


#### Examples

<details>
<summary>deformconv</summary>

```python
X = np.arange(9).astype(np.float32)
X.shape = (1, 1, 3, 3)
W = np.ones((1, 1, 2, 2), dtype=np.float32)

# Convolution with padding
offset_with_padding = np.zeros((1, 8, 4, 4), dtype=np.float32)
offset_with_padding[
    0, 0, 0, 0
] = 0.5  # h-coord of [0, 0] element of kernel, at output position [0, 0]
offset_with_padding[
    0, 5, 1, 2
] = -0.1  # w-coord of [1, 0] element of kernel, at output position [1, 2]

node_with_padding = onnx.helper.make_node(
    "DeformConv",
    inputs=["X", "W", "offset_with_padding"],
    outputs=["Y_with_padding"],
    kernel_shape=[2, 2],
    pads=[1, 1, 1, 1],
)
Y_with_padding = np.array(
    [
        [
            [
                [0.0, 1.0, 3.0, 2.0],  # (1, 1, 4, 4) output tensor
                [3.0, 8.0, 11.9, 7.0],
                [9.0, 20.0, 24.0, 13.0],
                [6.0, 13.0, 15.0, 8.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node_with_padding,
    inputs=[X, W, offset_with_padding],
    outputs=[Y_with_padding],
    name="test_basic_deform_conv_with_padding",
)

# Convolution without padding
offset_without_padding = np.zeros((1, 8, 2, 2), dtype=np.float32)
offset_without_padding[
    0, 0, 0, 0
] = 0.5  # h-coord of [0, 0] element of kernel, at output position [0, 0]
offset_without_padding[
    0, 5, 0, 1
] = -0.1  # w-coord of [1, 0] element of kernel, at output position [0, 1]

node_without_padding = onnx.helper.make_node(
    "DeformConv",
    inputs=["X", "W", "offset_without_padding"],
    outputs=["Y_without_padding"],
    kernel_shape=[2, 2],
    pads=[0, 0, 0, 0],
)
Y_without_padding = np.array(
    [
        [
            [
                [9.5, 11.9],  # (1, 1, 2, 2) output tensor
                [20.0, 24.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node_without_padding,
    inputs=[X, W, offset_without_padding],
    outputs=[Y_without_padding],
    name="test_basic_deform_conv_without_padding",
)
```

</details>


<details>
<summary>deformconv_with_mask_bias</summary>

```python
X = np.arange(9).astype(np.float32)
X.shape = (1, 1, 3, 3)
W = np.ones((1, 1, 2, 2), dtype=np.float32)
B = np.ones((1,), dtype=np.float32)

offset = np.zeros((1, 8, 2, 2), dtype=np.float32)
offset[
    0, 0, 0, 0
] = 0.5  # h-coord of [0, 0] element of kernel, at output position [0, 0]
offset[
    0, 5, 0, 1
] = -0.1  # w-coord of [1, 0] element of kernel, at output position [0, 1]

mask = np.ones((1, 4, 2, 2), dtype=np.float32)
mask[0, 2, 1, 1] = 0.2  # [1, 0] element of kernel at output position [1, 1]

node = onnx.helper.make_node(
    "DeformConv",
    inputs=["X", "W", "offset", "B", "mask"],
    outputs=["Y"],
    kernel_shape=[2, 2],
    pads=[0, 0, 0, 0],
)
Y = np.array(
    [
        [
            [
                [10.5, 12.9],  # (1, 1, 2, 2) output tensor
                [21.0, 19.4],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node,
    inputs=[X, W, offset, B, mask],
    outputs=[Y],
    name="test_deform_conv_with_mask_bias",
)
```

</details>


<details>
<summary>deformconv_with_multiple_offset_groups</summary>

```python
X = np.zeros((1, 2, 3, 3), dtype=np.float32)
X[0, 0] = np.reshape(np.arange(9).astype(np.float32), (3, 3))
X[0, 1] = np.reshape(np.arange(8, -1, -1).astype(np.float32), (3, 3))
X.shape = (1, 2, 3, 3)
W = np.ones((1, 2, 2, 2), dtype=np.float32)

offset = np.zeros((1, 16, 2, 2), dtype=np.float32)
offset[
    0, 0, 0, 0
] = 0.5  # h-coord of [0, 0] element of kernel in channel 0, at output position [0, 0]
offset[
    0, 13, 0, 1
] = (
    -0.1
)  # w-coord of [1, 0] element of kernel in channel 1, at output position [0, 1]

node = onnx.helper.make_node(
    "DeformConv",
    inputs=["X", "W", "offset"],
    outputs=["Y"],
    kernel_shape=[2, 2],
    pads=[0, 0, 0, 0],
    offset_group=2,
)
Y = np.array(
    [
        [
            [
                [33.5, 32.1],  # (1, 1, 2, 2) output tensor
                [32.0, 32.0],
            ]
        ]
    ]
).astype(np.float32)
expect(
    node,
    inputs=[X, W, offset],
    outputs=[Y],
    name="test_deform_conv_with_multiple_offset_groups",
)
```

</details>


### <a name="DepthToSpace"></a><a name="depthtospace">**DepthToSpace**</a>

  DepthToSpace rearranges (permutes) data from depth into blocks of spatial data.
  This is the reverse transformation of SpaceToDepth. More specifically, this op outputs a copy of
  the input tensor where values from the depth dimension are moved in spatial blocks to the height
  and width dimensions. By default, `mode` = `DCR`.
  In the DCR mode, elements along the depth dimension from the input tensor are rearranged in the
  following order: depth, column, and then row. The output y is computed from the input x as below:

  ```
  b, c, h, w = x.shape
  tmp = np.reshape(x, [b, blocksize, blocksize, c // (blocksize**2), h, w])
  tmp = np.transpose(tmp, [0, 3, 4, 1, 5, 2])
  y = np.reshape(tmp, [b, c // (blocksize**2), h * blocksize, w * blocksize])
  ```

  In the CRD mode, elements along the depth dimension from the input tensor are rearranged in the
  following order: column, row, and then depth. The output y is computed from the input x as below:

  ```
  b, c, h, w = x.shape
  tmp = np.reshape(x, [b, c // (blocksize ** 2), blocksize, blocksize, h, w])
  tmp = np.transpose(tmp, [0, 1, 4, 2, 5, 3])
  y = np.reshape(tmp, [b, c // (blocksize ** 2), h * blocksize, w * blocksize])
  ```
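
To make the difference between the two modes concrete, the two snippets above can be wrapped in a single function and applied to a tiny input. This is a sketch only: the function name `depth_to_space_sketch` is hypothetical, and the input is chosen so that each element's value equals its channel index.

```python
import numpy as np


def depth_to_space_sketch(x, blocksize, mode="DCR"):
    """Hypothetical helper combining the DCR and CRD formulas above."""
    b, c, h, w = x.shape
    c_out = c // (blocksize**2)
    if mode == "DCR":
        tmp = x.reshape(b, blocksize, blocksize, c_out, h, w)
        tmp = tmp.transpose(0, 3, 4, 1, 5, 2)
    elif mode == "CRD":
        tmp = x.reshape(b, c_out, blocksize, blocksize, h, w)
        tmp = tmp.transpose(0, 1, 4, 2, 5, 3)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return tmp.reshape(b, c_out, h * blocksize, w * blocksize)


# One pixel with 8 channels; the value of each element is its channel index.
x = np.arange(8, dtype=np.float32).reshape(1, 8, 1, 1)
dcr = depth_to_space_sketch(x, 2, "DCR")
crd = depth_to_space_sketch(x, 2, "CRD")
# dcr[0, 0] -> [[0, 2], [4, 6]]   dcr[0, 1] -> [[1, 3], [5, 7]]
# crd[0, 0] -> [[0, 1], [2, 3]]   crd[0, 1] -> [[4, 5], [6, 7]]
```

In DCR mode the output channel index varies fastest along the input depth, so consecutive input channels land in different output channels. In CRD mode each group of `blocksize * blocksize` consecutive input channels fills one output channel.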

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#DepthToSpace-1">1</a>, <a href="Changelog.md#DepthToSpace-11">11</a>

#### Attributes

<dl>
<dt><tt>blocksize</tt> : int (required)</dt>
<dd>Blocks of [blocksize, blocksize] are moved.</dd>
<dt><tt>mode</tt> : string (default is DCR)</dt>
<dd>DCR (default) for depth-column-row order re-arrangement. Use CRD for column-row-depth order.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor of [N,C,H,W], where N is the batch axis, C is the channel or depth, H is the height and W is the width.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Output tensor of [N, C/(blocksize * blocksize), H * blocksize, W * blocksize].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>crd_mode_example</summary>

```python
node = onnx.helper.make_node(
    "DepthToSpace", inputs=["x"], outputs=["y"], blocksize=2, mode="CRD"
)

# (1, 8, 2, 3) input tensor
x = np.array(
    [
        [
            [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
            [[9.0, 10.0, 11.0], [12.0, 13.0, 14.0]],
            [[18.0, 19.0, 20.0], [21.0, 22.0, 23.0]],
            [[27.0, 28.0, 29.0], [30.0, 31.0, 32.0]],
            [[36.0, 37.0, 38.0], [39.0, 40.0, 41.0]],
            [[45.0, 46.0, 47.0], [48.0, 49.0, 50.0]],
            [[54.0, 55.0, 56.0], [57.0, 58.0, 59.0]],
            [[63.0, 64.0, 65.0], [66.0, 67.0, 68.0]],
        ]
    ]
).astype(np.float32)

# (1, 2, 4, 6) output tensor
y = np.array(
    [
        [
            [
                [0.0, 9.0, 1.0, 10.0, 2.0, 11.0],
                [18.0, 27.0, 19.0, 28.0, 20.0, 29.0],
                [3.0, 12.0, 4.0, 13.0, 5.0, 14.0],
                [21.0, 30.0, 22.0, 31.0, 23.0, 32.0],
            ],
            [
                [36.0, 45.0, 37.0, 46.0, 38.0, 47.0],
                [54.0, 63.0, 55.0, 64.0, 56.0, 65.0],
                [39.0, 48.0, 40.0, 49.0, 41.0, 50.0],
                [57.0, 66.0, 58.0, 67.0, 59.0, 68.0],
            ],
        ]
    ]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_depthtospace_crd_mode_example")
```

</details>


<details>
<summary>default_mode_example</summary>

```python
node = onnx.helper.make_node(
    "DepthToSpace", inputs=["x"], outputs=["y"], blocksize=2, mode="DCR"
)

# (1, 8, 2, 3) input tensor
x = np.array(
    [
        [
            [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
            [[9.0, 10.0, 11.0], [12.0, 13.0, 14.0]],
            [[18.0, 19.0, 20.0], [21.0, 22.0, 23.0]],
            [[27.0, 28.0, 29.0], [30.0, 31.0, 32.0]],
            [[36.0, 37.0, 38.0], [39.0, 40.0, 41.0]],
            [[45.0, 46.0, 47.0], [48.0, 49.0, 50.0]],
            [[54.0, 55.0, 56.0], [57.0, 58.0, 59.0]],
            [[63.0, 64.0, 65.0], [66.0, 67.0, 68.0]],
        ]
    ]
).astype(np.float32)

# (1, 2, 4, 6) output tensor
y = np.array(
    [
        [
            [
                [0.0, 18.0, 1.0, 19.0, 2.0, 20.0],
                [36.0, 54.0, 37.0, 55.0, 38.0, 56.0],
                [3.0, 21.0, 4.0, 22.0, 5.0, 23.0],
                [39.0, 57.0, 40.0, 58.0, 41.0, 59.0],
            ],
            [
                [9.0, 27.0, 10.0, 28.0, 11.0, 29.0],
                [45.0, 63.0, 46.0, 64.0, 47.0, 65.0],
                [12.0, 30.0, 13.0, 31.0, 14.0, 32.0],
                [48.0, 66.0, 49.0, 67.0, 50.0, 68.0],
            ],
        ]
    ]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_depthtospace_example")
```

</details>


### <a name="DequantizeLinear"></a><a name="dequantizelinear">**DequantizeLinear**</a>

  The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the
  full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point`
  must have the same shape, determining the quantization's granularity: a scalar for per-tensor/per-layer quantization,
  a 1-D tensor for per-axis quantization, or have a rank identical to the input for blocked quantization.
  See QuantizeLinear for details on quantization granularity.

  `x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. When dequantizing
  `int32`, there is no zero point (the zero point is assumed to be 0).
  `x_zero_point` is usually not used for float8 quantization, but the dequantization formula remains the same
  for consistency, and `x_scale` still determines the output type.
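
The three granularities can be illustrated with a short NumPy sketch of the formula `y = (x - x_zero_point) * x_scale`. This is an assumption-laden illustration, not the reference implementation: the function name `dequantize_linear_sketch` is hypothetical, it only handles NumPy integer inputs (not float8 or 4-bit types), and it selects blocked mode whenever `block_size > 0`.

```python
import numpy as np


def dequantize_linear_sketch(x, x_scale, x_zero_point=None, axis=1, block_size=0):
    """Hypothetical helper covering per-tensor, per-axis, and blocked scales."""
    x_scale = np.asarray(x_scale)
    if x_zero_point is None:
        zero_point = np.zeros(x_scale.shape, dtype=x.dtype)
    else:
        zero_point = np.asarray(x_zero_point)

    if block_size > 0:
        # Blocked: replicate every scale block_size times along `axis`,
        # then trim to the length of x along that axis.
        keep = np.arange(x.shape[axis])
        scale = np.take(np.repeat(x_scale, block_size, axis=axis), keep, axis=axis)
        zero_point = np.take(np.repeat(zero_point, block_size, axis=axis), keep, axis=axis)
    elif x_scale.ndim == 1:
        # Per-axis: reshape the 1-D scale so it broadcasts along `axis`.
        shape = [1] * x.ndim
        shape[axis] = -1
        scale = x_scale.reshape(shape)
        zero_point = zero_point.reshape(shape)
    else:
        # Per-tensor: scalar scale and zero point broadcast everywhere.
        scale = x_scale

    y = (x.astype(np.float32) - zero_point.astype(np.float32)) * scale
    return y.astype(x_scale.dtype)  # x_scale determines the output type


x = np.array([0, 3, 128, 255], dtype=np.uint8)
y = dequantize_linear_sketch(x, np.float32(2), np.uint8(128))
# y -> [-256., -250., 0., 254.]
```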

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#DequantizeLinear-10">10</a>, <a href="Changelog.md#DequantizeLinear-13">13</a>, <a href="Changelog.md#DequantizeLinear-19">19</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 1)</dt>
<dd>(Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. Negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.</dd>
<dt><tt>block_size</tt> : int (default is 0)</dt>
<dd>(Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` shape `(D0, ..., Di, ..., Dn)`, `x_scale` shape `(S0, ... Si, ...Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`.</dd>
</dl>
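
As a worked example of the accepted `block_size` range, the sketch below evaluates `[ceil(Di/Si), ceil(Di/(Si-1))-1]` and cross-checks it against a brute-force search for block sizes that split an axis of length `Di` into exactly `Si` blocks. Reading the range this way is an interpretation, and the helper names are hypothetical. The formula's upper bound is undefined for `Si = 1`, so the sketch reports no upper bound in that case.

```python
import math


def block_size_range(d_i, s_i):
    """Evaluate [ceil(Di/Si), ceil(Di/(Si-1)) - 1]; hypothetical helper."""
    lower = math.ceil(d_i / s_i)
    # Assumption: with a single block (s_i == 1) no upper bound is reported.
    upper = math.ceil(d_i / (s_i - 1)) - 1 if s_i > 1 else None
    return lower, upper


def brute_force(d_i, s_i):
    """All block sizes b for which ceil(d_i / b) equals s_i."""
    return [b for b in range(1, d_i + 1) if math.ceil(d_i / b) == s_i]


print(block_size_range(4, 2), brute_force(4, 2))    # (2, 3) [2, 3]
print(block_size_range(10, 3), brute_force(10, 3))  # (4, 4) [4]
```

For `Di = 4` and `Si = 2`, a block size of 3 is accepted even though it does not divide the axis evenly: the second block then covers a single element.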

#### Inputs (2 - 3)

<dl>
<dt><tt>x</tt> : T1</dt>
<dd>N-D quantized input tensor to be de-quantized.</dd>
<dt><tt>x_scale</tt> : T2</dt>
<dd>Scale for input `x`. For per-tensor/layer dequantization the scale is a scalar, for per-axis dequantization it is a 1-D Tensor and for blocked dequantization it has the same shape as the input, except for one dimension in which blocking is performed.</dd>
<dt><tt>x_zero_point</tt> (optional) : T1</dt>
<dd>Zero point for input `x`. Shape must match x_scale. It's optional. Zero point is 0 when it's not specified.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>y</tt> : T2</dt>
<dd>N-D full precision output tensor. It has the same shape as input `x`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(int32), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>The type of the inputs 'x_zero_point' and 'x'.</dd>
<dt><tt>T2</tt> : tensor(float), tensor(float16), tensor(bfloat16)</dt>
<dd>'x_scale' determines the output type.</dd>
</dl>


#### Examples

<details>
<summary>axis</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
)

# 1-D tensor zero point and scale of size equal to axis 1 of the input tensor
x = np.array(
    [
        [
            [[3, 89], [34, 200], [74, 59]],
            [[5, 24], [24, 87], [32, 13]],
            [[245, 99], [4, 142], [121, 102]],
        ],
    ],
    dtype=np.uint8,
)
x_scale = np.array([2, 4, 5], dtype=np.float32)
x_zero_point = np.array([84, 24, 196], dtype=np.uint8)
y = (
    x.astype(np.float32) - x_zero_point.reshape(1, 3, 1, 1).astype(np.float32)
) * x_scale.reshape(1, 3, 1, 1)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear_axis",
)
```

</details>


<details>
<summary>blocked</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
    axis=1,
    block_size=2,
)

x = np.array(
    [
        [
            [[3, 89], [34, 200], [74, 59]],
            [[5, 24], [24, 87], [32, 13]],
            [[5, 12], [12, 33], [65, 42]],
            [[245, 99], [4, 142], [121, 102]],
        ],
    ],
    dtype=np.uint8,
)

x_scale = np.array(
    [
        [
            [[3.0, 2.0], [4.0, 1.0], [2.0, 2.0]],
            [[5.0, 2.0], [4.0, 3.0], [5.0, 2.0]],
        ],
    ],
    dtype=np.float32,
)
x_zero_point = np.array(
    [
        [
            [[1, 0], [0, 1], [2, 20]],
            [[3, 2], [4, 3], [15, 2]],
        ],
    ],
    dtype=np.uint8,
)

# x.shape = (1, 4, 3, 2)
# x_scale.shape = (1, 2, 3, 2)
assert x_scale.shape == x_zero_point.shape
block_axis = 1
# The block shape is [x.shape[i] // x_scale.shape[i] for i in range(len(x.shape))] = (1, 2, 1, 1)
assert all(
    x.shape[i] == x_scale.shape[i]
    for i in range(len(x.shape))
    if i != block_axis
)
assert x.shape[block_axis] % x_scale.shape[block_axis] == 0
repeats = x.shape[block_axis] // x_scale.shape[block_axis]

# Create element-wise scale and zero point
x_scale_elementwise = np.repeat(x_scale, repeats=repeats, axis=block_axis)
x_zero_point_elementwise = np.repeat(
    x_zero_point, repeats=repeats, axis=block_axis
)

y = (
    x.astype(np.float32) - x_zero_point_elementwise.astype(np.float32)
) * x_scale_elementwise

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear_blocked",
)
```

</details>


<details>
<summary>dequantizelinear</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
)

# scalar zero point and scale
x = np.array([0, 3, 128, 255]).astype(np.uint8)
x_scale = np.float32(2)
x_zero_point = np.uint8(128)
y = np.array([-256, -250, 0, 254], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear",
)
```

</details>


<details>
<summary>e4m3fn</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale"],
    outputs=["y"],
    axis=0,
)

# scalar scale, no zero point
x = make_tensor("x", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, -104])
x_scale = np.float32(2)
y = np.array([0.0, 1.0, 2.0, 896.0, -208.0], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale],
    outputs=[y],
    name="test_dequantizelinear_e4m3fn",
)
```

</details>


<details>
<summary>e4m3fn_float16</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale"],
    outputs=["y"],
    axis=0,
)

# scalar scale, no zero point
x = make_tensor("x", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, -104])
x_scale = np.float16(2)
y = np.array([0.0, 1.0, 2.0, 896.0, -208.0], dtype=np.float16)

expect(
    node,
    inputs=[x, x_scale],
    outputs=[y],
    name="test_dequantizelinear_e4m3fn_float16",
)
```

</details>


<details>
<summary>e4m3fn_zero_point</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "zero_point"],
    outputs=["y"],
    axis=0,
)

# scalar zero point and scale
x = make_tensor("x", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, -104])
zero_point = make_tensor("zero_point", TensorProto.FLOAT8E4M3FN, [1], [0])
x_scale = np.float32(2)
y = np.array([0.0, 1.0, 2.0, 896.0, -208.0], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale, zero_point],
    outputs=[y],
    name="test_dequantizelinear_e4m3fn_zero_point",
)
```

</details>


<details>
<summary>e5m2</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale"],
    outputs=["y"],
    axis=0,
)

# scalar scale, no zero point
x = make_tensor("x", TensorProto.FLOAT8E5M2, [5], [0, 0.5, 1, 49152, -96])
x_scale = np.float32(2)
y = np.array([0.0, 1.0, 2.0, 98304.0, -192.0], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale],
    outputs=[y],
    name="test_dequantizelinear_e5m2",
)
```

</details>


<details>
<summary>int16</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
)

x = np.array([-300, -30, -1025, 1270]).astype(np.int16)
x_scale = np.float32(2)
x_zero_point = np.int16(-1024)
y = np.array([1448.0, 1988.0, -2.0, 4588.0], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear_int16",
)
```

</details>


<details>
<summary>int4</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
    axis=0,
)

# scalar zero point and scale
x = make_tensor("x", TensorProto.INT4, [5], [0, 1, 7, -4, -8])
x_scale = np.float32(2)
x_zero_point = make_tensor("x_zero_point", TensorProto.INT4, (1,), [1])
y = np.array([-2, 0, 12, -10, -18], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear_int4",
)
```

</details>


<details>
<summary>uint16</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
)

x = np.array([30000, 31000, 32768, 33000]).astype(np.uint16)
x_scale = np.float32(2)
x_zero_point = np.uint16(32767)
y = np.array([-5534.0, -3534.0, 2.0, 466.0], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear_uint16",
)
```

</details>


<details>
<summary>uint4</summary>

```python
node = onnx.helper.make_node(
    "DequantizeLinear",
    inputs=["x", "x_scale", "x_zero_point"],
    outputs=["y"],
    axis=0,
)

# scalar zero point and scale
x = make_tensor("x", TensorProto.UINT4, [5], [0, 1, 7, 10, 15])
x_scale = np.float32(2)
x_zero_point = make_tensor("x_zero_point", TensorProto.UINT4, (1,), [1])
y = np.array([-2, 0, 12, 18, 28], dtype=np.float32)

expect(
    node,
    inputs=[x, x_scale, x_zero_point],
    outputs=[y],
    name="test_dequantizelinear_uint4",
)
```

</details>


### <a name="Det"></a><a name="det">**Det**</a>

  Det calculates determinant of a square matrix or batches of square matrices.
  Det takes one input tensor of shape `[*, M, M]`, where `*` is zero or more batch dimensions,
  and the inner-most 2 dimensions form square matrices.
  The output is a tensor of shape `[*]`, containing the determinants of all input submatrices.
  For example, when the input is 2-D, the output is a scalar (its shape is empty: `[]`).
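
The shape rule can be checked quickly with NumPy. The sketch uses `np.linalg.det` as a stand-in for the operator, as the examples below do, and the input shapes are arbitrary choices for illustration.

```python
import numpy as np

# Two batch dimensions (2, 3) of 4x4 matrices: the output keeps only the batch shape.
x = np.random.default_rng(0).standard_normal((2, 3, 4, 4)).astype(np.float32)
y = np.linalg.det(x)
print(y.shape)  # (2, 3)

# A single 2-D matrix: the output is a scalar with empty shape.
single = np.linalg.det(np.eye(4, dtype=np.float32))
print(np.shape(single))  # ()
```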

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to floating-point tensors.</dd>
</dl>


#### Examples

<details>
<summary>2d</summary>

```python
node = onnx.helper.make_node(
    "Det",
    inputs=["x"],
    outputs=["y"],
)

x = np.arange(4).reshape(2, 2).astype(np.float32)
y = np.linalg.det(x)  # expect -2
expect(node, inputs=[x], outputs=[y], name="test_det_2d")
```

</details>


<details>
<summary>nd</summary>

```python
node = onnx.helper.make_node(
    "Det",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([[[1, 2], [3, 4]], [[1, 2], [2, 1]], [[1, 3], [3, 1]]]).astype(
    np.float32
)
y = np.linalg.det(x)  # expect array([-2., -3., -8.])
expect(node, inputs=[x], outputs=[y], name="test_det_nd")
```

</details>


### <a name="Div"></a><a name="div">**Div**</a>

  Performs element-wise binary division (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

  (Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
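
A small NumPy sketch of what multidirectional broadcasting means for this operator: both operands may be expanded, not just the second one. The shapes below are arbitrary choices for illustration. The integer case uses `//` on unsigned inputs, mirroring the `uint8` example below.

```python
import numpy as np

a = np.arange(6, dtype=np.float32).reshape(2, 3, 1)     # shape (2, 3, 1)
b = np.array([[1.0, 2.0, 4.0, 8.0]], dtype=np.float32)  # shape (1, 4)
c = a / b
print(c.shape)  # (2, 3, 4): a is expanded along the last axis, b along the first two

# Unsigned integer inputs: the result stays uint8 and the fractional part is dropped.
q = np.array([7, 9], dtype=np.uint8) // np.array([2, 4], dtype=np.uint8)
print(q)  # [3 2]
```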

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Div-1">1</a>, <a href="Changelog.md#Div-6">6</a>, <a href="Changelog.md#Div-7">7</a>, <a href="Changelog.md#Div-13">13</a>

#### Inputs

<dl>
<dt><tt>A</tt> (differentiable) : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> (differentiable) : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (differentiable) : T</dt>
<dd>Result. It has the same element type as the two inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>div</summary>

```python
node = onnx.helper.make_node(
    "Div",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([3, 4]).astype(np.float32)
y = np.array([1, 2]).astype(np.float32)
z = x / y  # expected output [3., 2.]
expect(node, inputs=[x, y], outputs=[z], name="test_div_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.rand(3, 4, 5).astype(np.float32) + 1.0
z = x / y
expect(node, inputs=[x, y], outputs=[z], name="test_div")

x = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8) + 1
z = x // y
expect(node, inputs=[x, y], outputs=[z], name="test_div_uint8")
```

</details>


<details>
<summary>div_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Div",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.rand(5).astype(np.float32) + 1.0
z = x / y
expect(node, inputs=[x, y], outputs=[z], name="test_div_bcast")
```

</details>


### <a name="Dropout"></a><a name="dropout">**Dropout**</a>

  Dropout takes an input floating-point tensor, an optional input ratio (floating-point scalar) and an optional input training_mode (boolean scalar). It produces two tensor outputs,
  output (floating-point tensor) and mask (optional `Tensor<bool>`). If `training_mode` is true, the output is a random dropout of the input.
  Note that this Dropout scales the masked input data by the following equation, so to convert the trained model into inference mode,
  the user can simply not pass the `training_mode` input or set it to false.
  ```
  output = scale * data * mask,
  ```
  where
  ```
  scale = 1. / (1. - ratio).
  ```
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
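
The examples below call a `dropout` helper that is not shown in this section. A minimal sketch of the behavior described above might look as follows. The name `dropout_sketch`, the use of `np.random.default_rng`, and the `>=` comparison used to build the mask are assumptions for illustration and will not reproduce the reference outputs bit for bit.

```python
import numpy as np


def dropout_sketch(x, ratio=0.5, training_mode=False, seed=0):
    """Return (output, mask); hypothetical helper, not the one used by the examples."""
    if not training_mode or ratio == 0:
        # Inference mode: nothing is dropped and the mask is all ones.
        return x, np.ones(x.shape, dtype=bool)
    rng = np.random.default_rng(seed)
    mask = rng.uniform(0.0, 1.0, x.shape) >= ratio  # True where the value is kept
    scale = 1.0 / (1.0 - ratio)
    return (scale * x * mask).astype(x.dtype), mask


x = np.ones((3, 4, 5), dtype=np.float32)
y_infer, mask_infer = dropout_sketch(x, ratio=0.5)
y_train, mask_train = dropout_sketch(x, ratio=0.5, training_mode=True)
# In training mode every kept element equals 2.0 (scale = 1 / (1 - 0.5))
# and every dropped element equals 0.0.
```

Scaling by `1 / (1 - ratio)` during training keeps the expected value of each element unchanged, which is why inference mode can return the input as is.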

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Dropout-1">1</a>, <a href="Changelog.md#Dropout-6">6</a>, <a href="Changelog.md#Dropout-7">7</a>, <a href="Changelog.md#Dropout-10">10</a>, <a href="Changelog.md#Dropout-12">12</a>

#### Attributes

<dl>
<dt><tt>seed</tt> : int</dt>
<dd>(Optional) Seed to the random generator. If not specified, one is generated automatically.</dd>
</dl>

#### Inputs (1 - 3)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>The input data as Tensor.</dd>
<dt><tt>ratio</tt> (optional, non-differentiable) : T1</dt>
<dd>The ratio of random dropout, with value in [0, 1). If this input was not set, or if it was set to 0, the output would be a simple copy of the input. If it's non-zero, output will be a random dropout of the scaled input, which is typically the case during training. It is an optional value, if not specified it will default to 0.5.</dd>
<dt><tt>training_mode</tt> (optional, non-differentiable) : T2</dt>
<dd>If set to true then it indicates dropout is being used for training. It is an optional value hence unless specified explicitly, it is false. If it is false, ratio is ignored and the operation mimics inference mode where nothing will be dropped from the input data and if mask is requested as output it will contain all ones.</dd>
</dl>

#### Outputs (1 - 2)

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The output.</dd>
<dt><tt>mask</tt> (optional, non-differentiable) : T2</dt>
<dd>The output mask.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input 'ratio' types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(bool)</dt>
<dd>Constrain output 'mask' types to boolean tensors.</dd>
</dl>


#### Examples

<details>
<summary>default</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node("Dropout", inputs=["x"], outputs=["y"], seed=seed)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = dropout(x)
expect(node, inputs=[x], outputs=[y], name="test_dropout_default")
```

</details>


<details>
<summary>default_mask</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x"], outputs=["y", "z"], seed=seed
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y, z = dropout(x, return_mask=True)
expect(node, inputs=[x], outputs=[y, z], name="test_dropout_default_mask")
```

</details>


<details>
<summary>default_mask_ratio</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r"], outputs=["y", "z"], seed=seed
)

r = np.float32(0.1)
x = np.random.randn(3, 4, 5).astype(np.float32)
y, z = dropout(x, r, return_mask=True)
expect(
    node, inputs=[x, r], outputs=[y, z], name="test_dropout_default_mask_ratio"
)
```

</details>


<details>
<summary>default_old</summary>

```python
node = onnx.helper.make_node(
    "Dropout",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = x
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_dropout_default_old",
    opset_imports=[onnx.helper.make_opsetid("", 11)],
)
```

</details>


<details>
<summary>default_ratio</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r"], outputs=["y"], seed=seed
)

r = np.float32(0.1)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = dropout(x, r)
expect(node, inputs=[x, r], outputs=[y], name="test_dropout_default_ratio")
```

</details>


<details>
<summary>random_old</summary>

```python
node = onnx.helper.make_node(
    "Dropout",
    inputs=["x"],
    outputs=["y"],
    ratio=0.2,
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = x
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_dropout_random_old",
    opset_imports=[helper.make_opsetid("", 11)],
)
```

</details>


<details>
<summary>training</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r", "t"], outputs=["y"], seed=seed
)

x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.75)
t = np.bool_(True)
y = dropout(x, r, training_mode=t)
expect(node, inputs=[x, r, t], outputs=[y], name="test_training_dropout")
```

</details>


<details>
<summary>training_default</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r", "t"], outputs=["y"], seed=seed
)

x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.5)
t = np.bool_(True)
y = dropout(x, r, training_mode=t)
expect(
    node, inputs=[x, r, t], outputs=[y], name="test_training_dropout_default"
)
```

</details>


<details>
<summary>training_default_ratio_mask</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r", "t"], outputs=["y", "z"], seed=seed
)

x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.5)
t = np.bool_(True)
y, z = dropout(x, r, training_mode=t, return_mask=True)
expect(
    node,
    inputs=[x, r, t],
    outputs=[y, z],
    name="test_training_dropout_default_mask",
)
```

</details>


<details>
<summary>training_default_zero_ratio</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r", "t"], outputs=["y"], seed=seed
)

x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.0)
t = np.bool_(True)
y = dropout(x, r, training_mode=t)
expect(
    node, inputs=[x, r, t], outputs=[y], name="test_training_dropout_zero_ratio"
)
```

</details>


<details>
<summary>training_default_zero_ratio_mask</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r", "t"], outputs=["y", "z"], seed=seed
)

x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.0)
t = np.bool_(True)
y, z = dropout(x, r, training_mode=t, return_mask=True)
expect(
    node,
    inputs=[x, r, t],
    outputs=[y, z],
    name="test_training_dropout_zero_ratio_mask",
)
```

</details>


<details>
<summary>training_ratio_mask</summary>

```python
seed = np.int64(0)
node = onnx.helper.make_node(
    "Dropout", inputs=["x", "r", "t"], outputs=["y", "z"], seed=seed
)

x = np.random.randn(3, 4, 5).astype(np.float32)
r = np.float32(0.75)
t = np.bool_(True)
y, z = dropout(x, r, training_mode=t, return_mask=True)
expect(
    node, inputs=[x, r, t], outputs=[y, z], name="test_training_dropout_mask"
)
```

</details>


### <a name="DynamicQuantizeLinear"></a><a name="dynamicquantizelinear">**DynamicQuantizeLinear**</a>

  A function that fuses the calculation of the scale, the zero point, and the FP32-to-8-bit conversion of FP32 input data.
  It outputs the scale, the zero point, and the quantized input for a given FP32 input.
  The scale is calculated as:
  ```
  y_scale = (maximum(0, max(x)) - minimum(0, min(x))) / (qmax - qmin)
  ```

  * where qmax and qmin are the maximum and minimum values of the quantization range, i.e., [0, 255] for uint8
  * the data range is adjusted to include 0.

  Zero point is calculated as:
  ```
  intermediate_zero_point = qmin - min(x)/y_scale
  y_zero_point = cast(round(saturate(intermediate_zero_point)))
  ```

  * where qmax and qmin are the maximum and minimum values of the quantization range, i.e., [0, 255] for uint8
  * for saturation, it saturates to [0, 255] if it's uint8, or [-127, 127] if it's int8. Right now only uint8 is supported.
  * rounding to nearest ties to even.

  Data quantization formula is:
  ```
  y = saturate (round (x / y_scale) + y_zero_point)
  ```

  * for saturation, it saturates to [0, 255] if it's uint8, or [-127, 127] if it's int8. Right now only uint8 is supported.
  * rounding to nearest ties to even.
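
The three formulas above can be sketched in NumPy as follows. This is a minimal illustration rather than the reference implementation: the helper name `dynamic_quantize_linear` and its `qmin`/`qmax` parameters are assumptions made for this example, and only the uint8 range is exercised. `np.round` rounds halves to the nearest even value, which matches the rounding rule stated above.

```python
import numpy as np


def dynamic_quantize_linear(x, qmin=0, qmax=255):
    """Hypothetical helper: uint8 dynamic quantization of a float32 array."""
    x = np.asarray(x, dtype=np.float32)
    # Adjust the data range so that it always includes 0.
    x_min = np.minimum(0, np.min(x))
    x_max = np.maximum(0, np.max(x))
    y_scale = np.float32((x_max - x_min) / (qmax - qmin))
    # Zero point: round to nearest (ties to even), then saturate to [qmin, qmax].
    zero_point = np.clip(np.round(qmin - x_min / y_scale), qmin, qmax)
    # Quantize the data with the same rounding and saturation rules.
    y = np.clip(np.round(x / y_scale) + zero_point, qmin, qmax)
    return y.astype(np.uint8), y_scale, zero_point.astype(np.uint8)


x = np.array([0, 2, -3, -2.5, 1.34, 0.5], dtype=np.float32)
y, y_scale, y_zero_point = dynamic_quantize_linear(x)
```

The sketch does not guard against an all-zero input, where the scale becomes 0 and the divisions are undefined.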

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>x</tt> : T1</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>y</tt> : T2</dt>
<dd>Quantized output tensor</dd>
<dt><tt>y_scale</tt> : tensor(float)</dt>
<dd>Output scale. It's a scalar, which means a per-tensor/layer quantization.</dd>
<dt><tt>y_zero_point</tt> : T2</dt>
<dd>Output zero point. It's a scalar, which means a per-tensor/layer quantization.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float)</dt>
<dd>Constrain 'x' to float tensor.</dd>
<dt><tt>T2</tt> : tensor(uint8)</dt>
<dd>Constrain 'y_zero_point' and 'y' to 8-bit unsigned integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>dynamicquantizelinear</summary>

```python
node = onnx.helper.make_node(
    "DynamicQuantizeLinear",
    inputs=["x"],
    outputs=["y", "y_scale", "y_zero_point"],
)

# expected scale 0.0196078438 and zero point 153
X = np.array([0, 2, -3, -2.5, 1.34, 0.5]).astype(np.float32)
x_min = np.minimum(0, np.min(X))
x_max = np.maximum(0, np.max(X))
Y_Scale = np.float32((x_max - x_min) / (255 - 0))  # uint8 -> [0, 255]
Y_ZeroPoint = np.clip(round((0 - x_min) / Y_Scale), 0, 255).astype(np.uint8)
Y = np.clip(np.round(X / Y_Scale) + Y_ZeroPoint, 0, 255).astype(np.uint8)

expect(
    node,
    inputs=[X],
    outputs=[Y, Y_Scale, Y_ZeroPoint],
    name="test_dynamicquantizelinear",
)

# expected scale 0.0156862754 and zero point 255
X = np.array([-1.0, -2.1, -1.3, -2.5, -3.34, -4.0]).astype(np.float32)
x_min = np.minimum(0, np.min(X))
x_max = np.maximum(0, np.max(X))
Y_Scale = np.float32((x_max - x_min) / (255 - 0))  # uint8 -> [0, 255]
Y_ZeroPoint = np.clip(round((0 - x_min) / Y_Scale), 0, 255).astype(np.uint8)
Y = np.clip(np.round(X / Y_Scale) + Y_ZeroPoint, 0, 255).astype(np.uint8)

expect(
    node,
    inputs=[X],
    outputs=[Y, Y_Scale, Y_ZeroPoint],
    name="test_dynamicquantizelinear_max_adjusted",
)

X = (
    np.array([1, 2.1, 1.3, 2.5, 3.34, 4.0, 1.5, 2.6, 3.9, 4.0, 3.0, 2.345])
    .astype(np.float32)
    .reshape((3, 4))
)

# expected scale 0.0156862754 and zero point 0
x_min = np.minimum(0, np.min(X))
x_max = np.maximum(0, np.max(X))
Y_Scale = np.float32((x_max - x_min) / (255 - 0))  # uint8 -> [0, 255]
Y_ZeroPoint = np.clip(round((0 - x_min) / Y_Scale), 0, 255).astype(np.uint8)
Y = np.clip(np.round(X / Y_Scale) + Y_ZeroPoint, 0, 255).astype(np.uint8)

expect(
    node,
    inputs=[X],
    outputs=[Y, Y_Scale, Y_ZeroPoint],
    name="test_dynamicquantizelinear_min_adjusted",
)
```

</details>


### <a name="Einsum"></a><a name="einsum">**Einsum**</a>

  An einsum of the form `term1, term2 -> output-term` produces an output tensor using the following equation

  ```
  output[output-term] = reduce-sum( input1[term1] * input2[term2] )
  ```

  where the reduce-sum performs a summation over all the indices occurring in the input terms (term1, term2)
  that do not occur in the output-term.

  The Einsum operator evaluates algebraic tensor operations on a sequence of tensors, using the Einstein summation
  convention. The equation string contains a comma-separated sequence of terms made of lower case letters. Each term
  corresponds to an operand tensor, and the characters within a term correspond to that operand's dimensions.

  This sequence may be followed by "->" to separate the left and right hand side of the equation.
  If the equation contains "->" followed by the right-hand side, the explicit (not classical) form of the Einstein
  summation is performed, and the right-hand side indices indicate output tensor dimensions. In other cases,
  output indices are (implicitly) set to the alphabetically sorted sequence of indices appearing exactly once in the
  equation.

  When a dimension character is repeated in the left-hand side, it represents summation along the dimension.

  The equation may contain ellipsis ("...") to enable broadcasting. Ellipsis must indicate a fixed number of dimensions.
  Specifically, every occurrence of ellipsis in the equation must represent the same number of dimensions.
  The right-hand side may contain exactly one ellipsis. In implicit mode, the ellipsis dimensions are set to the
  beginning of the output. The equation string may contain space (U+0020) characters.
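
The explicit and implicit forms, repeated indices, and the ellipsis can be tried out with `numpy.einsum`. Using NumPy as a stand-in is an assumption of this sketch: it follows the same conventions for the equations shown here, but it is not the ONNX implementation.

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)

# Explicit form: "j" is absent from the output term, so it is summed over.
explicit = np.einsum("ij,jk->ik", a, b)

# Implicit form: the output indices are the letters that appear exactly once,
# sorted alphabetically ("i", "k"), so this is the same matrix product.
implicit = np.einsum("ij,jk", a, b)

# A letter repeated within one term walks the diagonal of that operand.
m = np.arange(9.0).reshape(3, 3)
diagonal = np.einsum("ii->i", m)
trace = np.einsum("ii", m)  # implicit form: "i" never appears once, so it is summed

# An ellipsis stands for the leading (batch) dimensions.
batch = np.arange(18.0).reshape(2, 3, 3)
batch_diagonal = np.einsum("...ii->...i", batch)
```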

#### Version

This version of the operator has been available since version 12 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>equation</tt> : string (required)</dt>
<dd>Einsum expression string.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>Inputs</tt> (variadic, differentiable) : T</dt>
<dd>Operands</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Output</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to all numerical tensor types.</dd>
</dl>


#### Examples

<details>
<summary>einsum_batch_diagonal</summary>

```python
Eqn = "...ii ->...i"
node = onnx.helper.make_node(
    "Einsum", inputs=["x"], outputs=["y"], equation=Eqn
)

X = np.random.randn(3, 5, 5)
Z = einsum_reference_implementation(Eqn, (X,))

expect(node, inputs=[X], outputs=[Z], name="test_einsum_batch_diagonal")
```

</details>


<details>
<summary>einsum_batch_matmul</summary>

```python
Eqn = "bij, bjk -> bik"
node = onnx.helper.make_node(
    "Einsum", inputs=["x", "y"], outputs=["z"], equation=Eqn
)

X = np.random.randn(5, 2, 3)
Y = np.random.randn(5, 3, 4)
Z = einsum_reference_implementation(Eqn, (X, Y))

expect(node, inputs=[X, Y], outputs=[Z], name="test_einsum_batch_matmul")
```

</details>


<details>
<summary>einsum_inner_prod</summary>

```python
Eqn = "i,i"
node = onnx.helper.make_node(
    "Einsum", inputs=["x", "y"], outputs=["z"], equation=Eqn
)

X = np.random.randn(5)
Y = np.random.randn(5)
Z = einsum_reference_implementation(Eqn, (X, Y))

expect(node, inputs=[X, Y], outputs=[Z], name="test_einsum_inner_prod")
```

</details>


<details>
<summary>einsum_sum</summary>

```python
Eqn = "ij->i"
node = onnx.helper.make_node(
    "Einsum", inputs=["x"], outputs=["y"], equation=Eqn
)

X = np.random.randn(3, 4)
Z = einsum_reference_implementation(Eqn, (X,))

expect(node, inputs=[X], outputs=[Z], name="test_einsum_sum")
```

</details>


<details>
<summary>einsum_transpose</summary>

```python
Eqn = "ij->ji"
node = onnx.helper.make_node(
    "Einsum", inputs=["x"], outputs=["y"], equation=Eqn
)

X = np.random.randn(3, 4)
Y = einsum_reference_implementation(Eqn, (X,))

expect(node, inputs=[X], outputs=[Y], name="test_einsum_transpose")
```

</details>


### <a name="Elu"></a><a name="elu">**Elu**</a>

  Elu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the function `f(x) = alpha * (exp(x) - 1.)` for `x < 0`,
  `f(x) = x` for `x >= 0`, is applied to the tensor elementwise.
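
As a rough illustration, the piecewise definition can be written directly with `np.where`. The helper name `elu` is an assumption made for this sketch and is not part of ONNX.

```python
import numpy as np


def elu(x, alpha=1.0):
    """Hypothetical helper: elementwise ELU for an array of any shape."""
    x = np.asarray(x)
    # Clamp the argument of exp to <= 0 so large positive inputs cannot overflow.
    negative_branch = alpha * (np.exp(np.minimum(x, 0)) - 1.0)
    # Pick alpha * (exp(x) - 1) where x < 0 and x itself elsewhere.
    return np.where(x < 0, negative_branch, x)


y = elu(np.array([-1, 0, 1], dtype=np.float32), alpha=2.0)
```

With `alpha=2.0`, the input `[-1, 0, 1]` maps to approximately `[-1.2642411, 0., 1.]`.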


#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Elu-1">1</a>

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 1.0)</dt>
<dd>Coefficient of ELU.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>elu</summary>

```python
node = onnx.helper.make_node("Elu", inputs=["x"], outputs=["y"], alpha=2.0)

x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-1.2642411, 0., 1.]
y = np.clip(x, 0, np.inf) + (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0
expect(node, inputs=[x], outputs=[y], name="test_elu_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0
expect(node, inputs=[x], outputs=[y], name="test_elu")
```

</details>


<details>
<summary>elu_default</summary>

```python
default_alpha = 1.0
node = onnx.helper.make_node(
    "Elu",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + (np.exp(np.clip(x, -np.inf, 0)) - 1) * default_alpha
expect(node, inputs=[x], outputs=[y], name="test_elu_default")
```

</details>


### <a name="Equal"></a><a name="equal">**Equal**</a>

  Returns the tensor resulting from performing the `equal` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 19 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Equal-1">1</a>, <a href="Changelog.md#Equal-7">7</a>, <a href="Changelog.md#Equal-11">11</a>, <a href="Changelog.md#Equal-13">13</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16), tensor(string)</dt>
<dd>Constrain input types to all (non-complex) tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>


#### Examples

<details>
<summary>equal</summary>

```python
node = onnx.helper.make_node(
    "Equal",
    inputs=["x", "y"],
    outputs=["z"],
)

x = (np.random.randn(3, 4, 5) * 10).astype(np.int32)
y = (np.random.randn(3, 4, 5) * 10).astype(np.int32)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal")
```

</details>


<details>
<summary>equal_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Equal",
    inputs=["x", "y"],
    outputs=["z"],
)

x = (np.random.randn(3, 4, 5) * 10).astype(np.int32)
y = (np.random.randn(5) * 10).astype(np.int32)
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_bcast")
```

</details>


<details>
<summary>equal_string</summary>

```python
node = onnx.helper.make_node(
    "Equal",
    inputs=["x", "y"],
    outputs=["z"],
)
x = np.array(["string1", "string2"], dtype=np.dtype(object))
y = np.array(["string1", "string3"], dtype=np.dtype(object))
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_string")
```

</details>


<details>
<summary>equal_string_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Equal",
    inputs=["x", "y"],
    outputs=["z"],
)
x = np.array(["string1", "string2"], dtype=np.dtype(object))
y = np.array(["string1"], dtype=np.dtype(object))
z = np.equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_equal_string_broadcast")
```

</details>


### <a name="Erf"></a><a name="erf">**Erf**</a>

  Computes the error function of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Erf-9">9</a>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The error function of the input tensor computed element-wise. It has the same shape and type as the input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>erf</summary>

```python
node = onnx.helper.make_node(
    "Erf",
    inputs=["x"],
    outputs=["y"],
)

x = np.random.randn(1, 3, 32, 32).astype(np.float32)
y = np.vectorize(math.erf)(x).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_erf")
```

</details>


### <a name="Exp"></a><a name="exp">**Exp**</a>

  Calculates the exponential of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Exp-1">1</a>, <a href="Changelog.md#Exp-6">6</a>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The exponential of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>exp</summary>

```python
node = onnx.helper.make_node(
    "Exp",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.exp(x)  # expected output [0.36787945, 1., 2.71828175]
expect(node, inputs=[x], outputs=[y], name="test_exp_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.exp(x)
expect(node, inputs=[x], outputs=[y], name="test_exp")
```

</details>


### <a name="Expand"></a><a name="expand">**Expand**</a>

  Broadcast the input tensor following the given shape and the broadcast rule.
  The broadcast rule is similar to numpy.array(input) * numpy.ones(shape):
  Dimensions are right-aligned;
  two corresponding dimensions must have the same value, or one of them must be equal to 1.
  This operator is also similar to numpy.broadcast_to(input, shape),
  but the major difference is that numpy.broadcast_to() does not allow shape to be smaller than input.shape
  (its result always has exactly the given shape).
  It is possible that output.shape is not equal to shape, when some dimensions in shape are equal to 1,
  or when shape.ndim < input.shape.ndim.
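
The two cases in which the output shape differs from `shape` can be seen in a short NumPy sketch. The helper name `expand` is hypothetical, and the `input * ones(shape)` formulation only works for numeric types, so this is an illustration of the shape rule rather than a full implementation.

```python
import numpy as np


def expand(data, shape):
    """Hypothetical helper following the `input * ones(shape)` rule."""
    return data * np.ones(shape, dtype=data.dtype)


data = np.arange(1, 4, dtype=np.float32).reshape(3, 1)

# shape has a 1 where data has 3: the output keeps the larger extent.
out_a = expand(data, [2, 1, 6])

# shape has fewer dimensions than data: the output keeps data's rank.
out_b = expand(data, [4])

# numpy.broadcast_to rejects the second case, because the result would not
# have exactly the requested shape.
try:
    np.broadcast_to(data, (4,))
    broadcast_to_accepts = True
except ValueError:
    broadcast_to_accepts = False
```

Here `out_a.shape` is `(2, 3, 6)` and `out_b.shape` is `(3, 4)`, neither of which equals the requested `shape`.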

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Expand-8">8</a>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
<dt><tt>shape</tt> (non-differentiable) : tensor(int64)</dt>
<dd>A 1-D tensor indicating the shape you want to expand to, following the broadcast rule</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensors.</dd>
</dl>


#### Examples

<details>
<summary>dim_changed</summary>

```python
node = onnx.helper.make_node(
    "Expand",
    inputs=["data", "new_shape"],
    outputs=["expanded"],
)
shape = [3, 1]
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[1.], [2.], [3.]]
new_shape = [2, 1, 6]
expanded = data * np.ones(new_shape, dtype=np.float32)
# print(expanded)
# [[[1., 1., 1., 1., 1., 1.],
#  [2., 2., 2., 2., 2., 2.],
#  [3., 3., 3., 3., 3., 3.]],
#
# [[1., 1., 1., 1., 1., 1.],
#  [2., 2., 2., 2., 2., 2.],
#  [3., 3., 3., 3., 3., 3.]]]
new_shape = np.array(new_shape, dtype=np.int64)
expect(
    node,
    inputs=[data, new_shape],
    outputs=[expanded],
    name="test_expand_dim_changed",
)
```

</details>


<details>
<summary>dim_unchanged</summary>

```python
node = onnx.helper.make_node(
    "Expand",
    inputs=["data", "new_shape"],
    outputs=["expanded"],
)
shape = [3, 1]
new_shape = [3, 4]
data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[1.], [2.], [3.]]
expanded = np.tile(data, 4)
# print(expanded)
# [[1., 1., 1., 1.],
# [2., 2., 2., 2.],
# [3., 3., 3., 3.]]
new_shape = np.array(new_shape, dtype=np.int64)
expect(
    node,
    inputs=[data, new_shape],
    outputs=[expanded],
    name="test_expand_dim_unchanged",
)
```

</details>


### <a name="EyeLike"></a><a name="eyelike">**EyeLike**</a>

  Generate a 2D tensor (matrix) with ones on the diagonal and zeros everywhere else. Only 2D
  tensors are supported, i.e. input T1 must be of rank 2. The shape of the output tensor is the
  same as the input tensor. The data type can be specified by the 'dtype' argument. If
  'dtype' is not specified, then the type of input tensor is used. By default, the main diagonal
  is populated with ones, but attribute 'k' can be used to populate upper or lower diagonals.
  The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the
  TensorProto message and be valid as an output type.
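
A minimal NumPy sketch of this behavior, including a lower diagonal (`k < 0`), might look as follows. The helper name `eye_like` is hypothetical, and its `dtype` parameter takes a NumPy dtype rather than a `TensorProto` enum value, which is a simplification made for this example.

```python
import numpy as np


def eye_like(x, dtype=None, k=0):
    """Hypothetical helper: ones on diagonal k, zeros elsewhere, shaped like x."""
    if x.ndim != 2:
        raise ValueError("only rank-2 inputs are supported")
    rows, cols = x.shape
    # Fall back to the input's element type when no dtype is given.
    return np.eye(rows, cols, k=k, dtype=x.dtype if dtype is None else dtype)


x = np.zeros((3, 4), dtype=np.int32)
lower = eye_like(x, k=-1)                    # first lower diagonal, int32
upper = eye_like(x, dtype=np.float32, k=1)   # first upper diagonal, float32
```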

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>(Optional) The data type for the elements of the output tensor. If not specified, the data type of the input tensor T1 is used. If input tensor T1 is also not specified, then type defaults to 'float'.</dd>
<dt><tt>k</tt> : int (default is 0)</dt>
<dd>(Optional) Index of the diagonal to be populated with ones. Default is 0. If T2 is the output, this op sets T2[i, i+k] = 1. k = 0 populates the main diagonal, k > 0 populates an upper diagonal, and k < 0 populates a lower diagonal.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>2D input tensor to copy shape, and optionally, type information from.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor, same shape as input tensor T1.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool)</dt>
<dd>Constrain input types. Strings and complex are not supported.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(bool)</dt>
<dd>Constrain output types. Strings and complex are not supported.</dd>
</dl>


#### Examples

<details>
<summary>populate_off_main_diagonal</summary>

```python
shape = (4, 5)
off_diagonal_offset = 1
node = onnx.helper.make_node(
    "EyeLike",
    inputs=["x"],
    outputs=["y"],
    k=off_diagonal_offset,
    dtype=onnx.TensorProto.FLOAT,
)

x = np.random.randint(0, 100, size=shape, dtype=np.int32)
y = np.eye(shape[0], shape[1], k=off_diagonal_offset, dtype=np.float32)
expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_eyelike_populate_off_main_diagonal",
)
```

</details>


<details>
<summary>with_dtype</summary>

```python
shape = (3, 4)
node = onnx.helper.make_node(
    "EyeLike",
    inputs=["x"],
    outputs=["y"],
    dtype=onnx.TensorProto.DOUBLE,
)

x = np.random.randint(0, 100, size=shape, dtype=np.int32)
y = np.eye(shape[0], shape[1], dtype=np.float64)
expect(node, inputs=[x], outputs=[y], name="test_eyelike_with_dtype")
```

</details>


<details>
<summary>without_dtype</summary>

```python
shape = (4, 4)
node = onnx.helper.make_node(
    "EyeLike",
    inputs=["x"],
    outputs=["y"],
)

x = np.random.randint(0, 100, size=shape, dtype=np.int32)
y = np.eye(shape[0], shape[1], dtype=np.int32)
expect(node, inputs=[x], outputs=[y], name="test_eyelike_without_dtype")
```

</details>


### <a name="Flatten"></a><a name="flatten">**Flatten**</a>

  Flattens the input tensor into a 2D matrix. If the input tensor has shape
  (d_0, d_1, ... d_n) then the output will have shape
  (d_0 X d_1 ... X d_(axis-1), d_axis X d_(axis+1) ... X d_n).
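
The shape rule can be sketched as a small NumPy helper. The name `flatten` is an assumption made for this example; the handling of negative values and of `axis = 0` follows the description of the `axis` attribute in this section.

```python
import math

import numpy as np


def flatten(a, axis=1):
    """Hypothetical helper: reshape `a` into a 2D matrix split at `axis`."""
    rank = a.ndim
    if not -rank <= axis <= rank:
        raise ValueError(f"axis must be in [{-rank}, {rank}]")
    if axis < 0:
        axis += rank  # negative values count dimensions from the back
    # math.prod of an empty sequence is 1, which covers axis == 0 and axis == rank.
    outer = math.prod(a.shape[:axis])
    inner = math.prod(a.shape[axis:])
    return a.reshape(outer, inner)


a = np.zeros((2, 3, 4, 5), dtype=np.float32)
shapes = {axis: flatten(a, axis).shape for axis in (0, 1, 2, -1, 4)}
```

For the input shape `(2, 3, 4, 5)`, this gives `(1, 120)` for `axis=0`, `(6, 20)` for `axis=2`, and `(24, 5)` for `axis=-1`.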

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Flatten-1">1</a>, <a href="Changelog.md#Flatten-9">9</a>, <a href="Changelog.md#Flatten-11">11</a>, <a href="Changelog.md#Flatten-13">13</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 1)</dt>
<dd>Indicate up to which input dimensions (exclusive) should be flattened to the outer dimension of the output. The value for axis must be in the range [-r, r], where r is the rank of the input tensor. Negative value means counting dimensions from the back. When axis = 0, the shape of the output tensor is (1, (d_0 X d_1 ... d_n)), where the shape of the input tensor is (d_0, d_1, ... d_n).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>A tensor of rank >= axis.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>A 2D tensor with the contents of the input tensor, with input dimensions up to axis flattened to the outer dimension of the output and remaining input dimensions flattened into the inner dimension of the output.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input and output to all tensor types up to IRv10.</dd>
</dl>


#### Examples

<details>
<summary>flatten</summary>

```python
shape = (2, 3, 4, 5)
a = np.random.random_sample(shape).astype(np.float32)

for i in range(len(shape)):
    node = onnx.helper.make_node(
        "Flatten",
        inputs=["a"],
        outputs=["b"],
        axis=i,
    )

    new_shape = (1, -1) if i == 0 else (np.prod(shape[0:i]).astype(int), -1)
    b = np.reshape(a, new_shape)
    expect(node, inputs=[a], outputs=[b], name="test_flatten_axis" + str(i))
```

</details>


<details>
<summary>flatten_negative_axis</summary>

```python
shape = (2, 3, 4, 5)
a = np.random.random_sample(shape).astype(np.float32)

for i in range(-len(shape), 0):
    node = onnx.helper.make_node(
        "Flatten",
        inputs=["a"],
        outputs=["b"],
        axis=i,
    )

    new_shape = (np.prod(shape[0:i]).astype(int), -1)
    b = np.reshape(a, new_shape)
    expect(
        node,
        inputs=[a],
        outputs=[b],
        name="test_flatten_negative_axis" + str(abs(i)),
    )
```

</details>


<details>
<summary>flatten_with_default_axis</summary>

```python
node = onnx.helper.make_node(
    "Flatten",
    inputs=["a"],
    outputs=["b"],  # Default value for axis: axis=1
)

shape = (5, 4, 3, 2)
a = np.random.random_sample(shape).astype(np.float32)
new_shape = (5, 24)
b = np.reshape(a, new_shape)
expect(node, inputs=[a], outputs=[b], name="test_flatten_default_axis")
```

</details>


### <a name="Floor"></a><a name="floor">**Floor**</a>

  Floor takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the floor function, y = floor(x), is applied to
  the tensor elementwise. If x is integral, +0, -0, NaN, or infinite, x itself is returned.
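
The special cases can be checked with `numpy.floor`, which is assumed here to behave the same way for these values:

```python
import numpy as np

x = np.array(
    [-1.5, 1.2, 2.0, 0.0, -0.0, np.nan, np.inf, -np.inf], dtype=np.float32
)
y = np.floor(x)

# The sign of zero is preserved, so -0 stays -0.
negative_zero_kept = bool(np.signbit(y[4]))
```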

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Floor-1">1</a>, <a href="Changelog.md#Floor-6">6</a>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>floor</summary>

```python
node = onnx.helper.make_node(
    "Floor",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1.5, 1.2, 2]).astype(np.float32)
y = np.floor(x)  # expected output [-2., 1., 2.]
expect(node, inputs=[x], outputs=[y], name="test_floor_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.floor(x)
expect(node, inputs=[x], outputs=[y], name="test_floor")
```

</details>


### <a name="GRU"></a><a name="gru">**GRU**</a>

  Computes a one-layer GRU. This operator is usually supported via some custom
  implementation such as CuDNN.

  Notations:

  * `X` - input tensor
  * `z` - update gate
  * `r` - reset gate
  * `h` - hidden gate
  * `t` - time step (t-1 means previous time step)
  * `W[zrh]` - W parameter weight matrix for update, reset, and hidden gates
  * `R[zrh]` - R recurrence weight matrix for update, reset, and hidden gates
  * `Wb[zrh]` - W bias vectors for update, reset, and hidden gates
  * `Rb[zrh]` - R bias vectors for update, reset, and hidden gates
  * `WB[zrh]` - W parameter weight matrix for backward update, reset, and hidden gates
  * `RB[zrh]` - R recurrence weight matrix for backward update, reset, and hidden gates
  * `WBb[zrh]` - W bias vectors for backward update, reset, and hidden gates
  * `RBb[zrh]` - R bias vectors for backward update, reset, and hidden gates
  * `H` - Hidden state
  * `num_directions` - 2 if direction == bidirectional else 1

  Activation functions:

  * Relu(x)                - max(0, x)
  * Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  * Sigmoid(x)             - 1/(1 + e^{-x})

  NOTE: The activation functions below are optional.

  * Affine(x)              - alpha * x + beta
  * LeakyRelu(x)           - x if x >= 0 else alpha * x
  * ThresholdedRelu(x)     - x if x >= alpha else 0
  * ScaledTanh(x)          - alpha * Tanh(beta * x)
  * HardSigmoid(x)         - min(max(alpha * x + beta, 0), 1)
  * Elu(x)                 - x if x >= 0 else alpha * (e^x - 1)
  * Softsign(x)            - x/(1 + |x|)
  * Softplus(x)            - log(1 + e^x)

  Equations (Default: f=Sigmoid, g=Tanh):

  * zt = f(Xt*(Wz^T) + Ht-1*(Rz^T) + Wbz + Rbz)
  * rt = f(Xt*(Wr^T) + Ht-1*(Rr^T) + Wbr + Rbr)
  * ht = g(Xt*(Wh^T) + (rt (.) Ht-1)*(Rh^T) + Rbh + Wbh) # default, when linear_before_reset = 0
  * ht = g(Xt*(Wh^T) + (rt (.) (Ht-1*(Rh^T) + Rbh)) + Wbh) # when linear_before_reset != 0
  * Ht = (1 - zt) (.) ht + zt (.) Ht-1

  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
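
The equations above can be sketched as a single forward time step in NumPy, with the default activations f=Sigmoid and g=Tanh. This is an illustration only: the helper names `sigmoid` and `gru_step` are hypothetical, and passing the weights as per-gate dictionaries with shape `[hidden_size, input_size]` (and `[hidden_size, hidden_size]` for the recurrence weights) is an assumption made for readability, not the packed layout used by the operator's inputs. Clipping, sequence handling, and the reverse direction are left out.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x_t, h_prev, W, R, Wb, Rb, linear_before_reset=0):
    """One forward step; W, R, Wb, Rb are dicts keyed by gate 'z', 'r', 'h'."""
    z = sigmoid(x_t @ W["z"].T + h_prev @ R["z"].T + Wb["z"] + Rb["z"])
    r = sigmoid(x_t @ W["r"].T + h_prev @ R["r"].T + Wb["r"] + Rb["r"])
    if linear_before_reset:
        # Apply the recurrent linear transformation first, then the reset gate.
        h = np.tanh(x_t @ W["h"].T + r * (h_prev @ R["h"].T + Rb["h"]) + Wb["h"])
    else:
        # Default: apply the reset gate to the previous state first.
        h = np.tanh(x_t @ W["h"].T + (r * h_prev) @ R["h"].T + Rb["h"] + Wb["h"])
    # Blend the candidate state with the previous state using the update gate.
    return (1.0 - z) * h + z * h_prev


batch_size, input_size, hidden_size = 2, 3, 4
rng = np.random.default_rng(0)
x_t = rng.standard_normal((batch_size, input_size))
h_prev = rng.standard_normal((batch_size, hidden_size))
W = {g: rng.standard_normal((hidden_size, input_size)) for g in "zrh"}
R = {g: rng.standard_normal((hidden_size, hidden_size)) for g in "zrh"}
Wb = {g: np.zeros(hidden_size) for g in "zrh"}
Rb = {g: np.zeros(hidden_size) for g in "zrh"}
h_t = gru_step(x_t, h_prev, W, R, Wb, Rb)
```

With all weights and biases set to zero, both gates evaluate to 0.5 and the candidate state to 0, so the step returns half of the previous hidden state. This makes a convenient sanity check for either setting of `linear_before_reset`.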

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#GRU-1">1</a>, <a href="Changelog.md#GRU-3">3</a>, <a href="Changelog.md#GRU-7">7</a>

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators. For example, with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as of corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>A list of 2 (or 4 if bidirectional) activation functions for update, reset, and hidden gates. The activation functions must be one of the activation functions specified above. Optional: See the equations for default if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string (default is forward)</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>layout</tt> : int (default is 0)</dt>
<dd>The shape format of inputs X, initial_h and outputs Y, Y_h. If 0, the following shapes are expected: X.shape = [seq_length, batch_size, input_size], Y.shape = [seq_length, num_directions, batch_size, hidden_size], initial_h.shape = Y_h.shape = [num_directions, batch_size, hidden_size]. If 1, the following shapes are expected: X.shape = [batch_size, seq_length, input_size], Y.shape = [batch_size, seq_length, num_directions, hidden_size], initial_h.shape = Y_h.shape = [batch_size, num_directions, hidden_size].</dd>
<dt><tt>linear_before_reset</tt> : int (default is 0)</dt>
<dd>When computing the output of the hidden gate, apply the linear transformation before multiplying by the output of the reset gate.</dd>
</dl>
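
The two `layout` values differ only in axis order, so tensors can be converted between them with plain transposes. The sketch below shows one way to do this in NumPy; the helper names are hypothetical and the shapes follow the `layout` description above.

```python
import numpy as np


def x_to_layout0(X):
    # [batch_size, seq_length, input_size] -> [seq_length, batch_size, input_size]
    return np.swapaxes(X, 0, 1)


def y_to_layout1(Y):
    # [seq_length, num_directions, batch_size, hidden_size]
    #   -> [batch_size, seq_length, num_directions, hidden_size]
    return np.transpose(Y, (2, 0, 1, 3))


def h_to_layout1(H):
    # [num_directions, batch_size, hidden_size] -> [batch_size, num_directions, hidden_size]
    return np.swapaxes(H, 0, 1)


seq_length, batch_size, input_size, hidden_size, num_directions = 5, 2, 3, 4, 1
X1 = np.zeros((batch_size, seq_length, input_size), dtype=np.float32)
Y0 = np.zeros((seq_length, num_directions, batch_size, hidden_size), dtype=np.float32)
H0 = np.zeros((num_directions, batch_size, hidden_size), dtype=np.float32)
```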

#### Inputs (3 - 6)

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> (differentiable) : T</dt>
<dd>The weight tensor for the gates. Concatenation of `W[zrh]` and `WB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, input_size]`.</dd>
<dt><tt>R</tt> (differentiable) : T</dt>
<dd>The recurrence weight tensor. Concatenation of `R[zrh]` and `RB[zrh]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 3*hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional, differentiable) : T</dt>
<dd>The bias tensor for the gates. Concatenation of `[Wb[zrh], Rb[zrh]]` and `[WBb[zrh], RBb[zrh]]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 6*hidden_size]`. Optional: If not specified - assumed to be 0</dd>
<dt><tt>sequence_lens</tt> (optional, non-differentiable) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified, all sequences in the batch are assumed to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional, non-differentiable) : T</dt>
<dd>Optional initial value of the hidden state. If not specified, it is assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>
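
To make the packing of `W`, `R`, and `B` concrete, the sketch below builds the three tensors for a bidirectional GRU from per-gate arrays. The helpers `random_direction` and `pack_gru_params` and the dictionary keys are assumptions for illustration; only the concatenation order (gates `z`, `r`, `h`, then input biases before recurrent biases) comes from the input descriptions above.

```python
import numpy as np

hidden_size, input_size = 4, 3
rng = np.random.default_rng(0)


def random_direction():
    # Hypothetical per-direction parameters; every list is ordered [z, r, h].
    return {
        "W": [rng.standard_normal((hidden_size, input_size)) for _ in range(3)],
        "R": [rng.standard_normal((hidden_size, hidden_size)) for _ in range(3)],
        "Wb": [rng.standard_normal(hidden_size) for _ in range(3)],
        "Rb": [rng.standard_normal(hidden_size) for _ in range(3)],
    }


def pack_gru_params(directions):
    # Concatenate the gates along dimension 0, then stack the directions.
    W = np.stack([np.concatenate(d["W"], axis=0) for d in directions])
    R = np.stack([np.concatenate(d["R"], axis=0) for d in directions])
    # Bias layout per direction: Wb[zrh] followed by Rb[zrh].
    B = np.stack([np.concatenate(d["Wb"] + d["Rb"], axis=0) for d in directions])
    return W, R, B


forward, backward = random_direction(), random_direction()
W, R, B = pack_gru_params([forward, backward])
# W: [2, 3*hidden_size, input_size], R: [2, 3*hidden_size, hidden_size],
# B: [2, 6*hidden_size]
```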

#### Outputs (0 - 2)

<dl>
<dt><tt>Y</tt> (optional, differentiable) : T</dt>
<dd>A tensor that concatenates all the intermediate output values of the hidden state. It has shape `[seq_length, num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>Y_h</tt> (optional, differentiable) : T</dt>
<dd>The last output value of the hidden state. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain sequence_lens to an integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>batchwise</summary>

```python
input = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]).astype(np.float32)

input_size = 2
hidden_size = 6
number_of_gates = 3
weight_scale = 0.2
layout = 1

node = onnx.helper.make_node(
    "GRU",
    inputs=["X", "W", "R"],
    outputs=["Y", "Y_h"],
    hidden_size=hidden_size,
    layout=layout,
)

W = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)

gru = GRUHelper(X=input, W=W, R=R, layout=layout)
Y, Y_h = gru.step()
expect(
    node,
    inputs=[input, W, R],
    outputs=[Y.astype(np.float32), Y_h.astype(np.float32)],
    name="test_gru_batchwise",
)
```

</details>


<details>
<summary>defaults</summary>

```python
input = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]).astype(np.float32)

input_size = 2
hidden_size = 5
weight_scale = 0.1
number_of_gates = 3

node = onnx.helper.make_node(
    "GRU", inputs=["X", "W", "R"], outputs=["", "Y_h"], hidden_size=hidden_size
)

W = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)

gru = GRUHelper(X=input, W=W, R=R)
_, Y_h = gru.step()
expect(
    node,
    inputs=[input, W, R],
    outputs=[Y_h.astype(np.float32)],
    name="test_gru_defaults",
)
```

</details>


<details>
<summary>initial_bias</summary>

```python
input = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]).astype(
    np.float32
)

input_size = 3
hidden_size = 3
weight_scale = 0.1
custom_bias = 0.1
number_of_gates = 3

node = onnx.helper.make_node(
    "GRU",
    inputs=["X", "W", "R", "B"],
    outputs=["", "Y_h"],
    hidden_size=hidden_size,
)

W = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)

# Adding custom bias
W_B = custom_bias * np.ones((1, number_of_gates * hidden_size)).astype(
    np.float32
)
R_B = np.zeros((1, number_of_gates * hidden_size)).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)

gru = GRUHelper(X=input, W=W, R=R, B=B)
_, Y_h = gru.step()
expect(
    node,
    inputs=[input, W, R, B],
    outputs=[Y_h.astype(np.float32)],
    name="test_gru_with_initial_bias",
)
```

</details>


<details>
<summary>seq_length</summary>

```python
input = np.array(
    [
        [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
        [[10.0, 11.0, 12.0], [13.0, 14.0, 15.0], [16.0, 17.0, 18.0]],
    ]
).astype(np.float32)

input_size = 3
hidden_size = 5
number_of_gates = 3

node = onnx.helper.make_node(
    "GRU",
    inputs=["X", "W", "R", "B"],
    outputs=["", "Y_h"],
    hidden_size=hidden_size,
)

W = np.random.randn(1, number_of_gates * hidden_size, input_size).astype(
    np.float32
)
R = np.random.randn(1, number_of_gates * hidden_size, hidden_size).astype(
    np.float32
)

# Adding custom bias
W_B = np.random.randn(1, number_of_gates * hidden_size).astype(np.float32)
R_B = np.random.randn(1, number_of_gates * hidden_size).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)

gru = GRUHelper(X=input, W=W, R=R, B=B)
_, Y_h = gru.step()
expect(
    node,
    inputs=[input, W, R, B],
    outputs=[Y_h.astype(np.float32)],
    name="test_gru_seq_length",
)
```

</details>


### <a name="Gather"></a><a name="gather">**Gather**</a>

  Given a `data` tensor of rank r >= 1 and an `indices` tensor of rank q, gather
  entries along the axis dimension of `data` (by default the outer-most one, axis=0) indexed by `indices`, and concatenate
  them in an output tensor of rank q + (r - 1).

  If `axis = 0`, let `k = indices[i_{0}, ..., i_{q-1}]`;
  then `output[i_{0}, ..., i_{q-1}, j_{0}, ..., j_{r-2}] = data[k, j_{0}, ..., j_{r-2}]`:

  ```
  data = [
      [1.0, 1.2],
      [2.3, 3.4],
      [4.5, 5.7],
  ]
  indices = [
      [0, 1],
      [1, 2],
  ]
  output = [
      [
          [1.0, 1.2],
          [2.3, 3.4],
      ],
      [
          [2.3, 3.4],
          [4.5, 5.7],
      ],
  ]
  ```

  If `axis = 1`, let `k = indices[i_{0}, ..., i_{q-1}]`;
  then `output[j_{0}, i_{0}, ..., i_{q-1}, j_{1}, ..., j_{r-2}] = data[j_{0}, k, j_{1}, ..., j_{r-2}]`:

  ```
  data = [
      [1.0, 1.2, 1.9],
      [2.3, 3.4, 3.9],
      [4.5, 5.7, 5.9],
  ]
  indices = [
      [0, 2],
  ]
  axis = 1
  output = [
      [[1.0, 1.9]],
      [[2.3, 3.9]],
      [[4.5, 5.9]],
  ]
  ```
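
Both worked examples above can be reproduced with `np.take`, which the test cases for this operator also use as the reference. The variable names below are chosen for this sketch only.

```python
import numpy as np

# axis = 0: rows of `data` are selected by each entry of `indices`.
data0 = np.array([[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]])
indices0 = np.array([[0, 1], [1, 2]])
out0 = np.take(data0, indices0, axis=0)  # shape (2, 2, 2): q + (r - 1) = 3 dims

# axis = 1: columns of `data` are selected; the index dims replace axis 1.
data1 = np.array([[1.0, 1.2, 1.9], [2.3, 3.4, 3.9], [4.5, 5.7, 5.9]])
indices1 = np.array([[0, 2]])
out1 = np.take(data1, indices1, axis=1)  # shape (3, 1, 2)
```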

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Gather-1">1</a>, <a href="Changelog.md#Gather-11">11</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 0)</dt>
<dd>Which axis to gather on. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1.</dd>
<dt><tt>indices</tt> (non-differentiable) : Tind</dt>
<dd>Tensor of int32/int64 indices, of any rank q. All index values are expected to be within bounds [-s, s-1] along axis of size s. It is an error if any of the index values are out of bounds.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor of rank q + (r - 1).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to any tensor type.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain indices to integer types</dd>
</dl>


#### Examples

<details>
<summary>gather_0</summary>

```python
node = onnx.helper.make_node(
    "Gather",
    inputs=["data", "indices"],
    outputs=["y"],
    axis=0,
)
data = np.random.randn(5, 4, 3, 2).astype(np.float32)
indices = np.array([0, 1, 3])
y = np.take(data, indices, axis=0)

expect(
    node,
    inputs=[data, indices.astype(np.int64)],
    outputs=[y],
    name="test_gather_0",
)
```

</details>


<details>
<summary>gather_1</summary>

```python
node = onnx.helper.make_node(
    "Gather",
    inputs=["data", "indices"],
    outputs=["y"],
    axis=1,
)
data = np.random.randn(5, 4, 3, 2).astype(np.float32)
indices = np.array([0, 1, 3])
y = np.take(data, indices, axis=1)

expect(
    node,
    inputs=[data, indices.astype(np.int64)],
    outputs=[y],
    name="test_gather_1",
)
```

</details>


<details>
<summary>gather_2d_indices</summary>

```python
node = onnx.helper.make_node(
    "Gather",
    inputs=["data", "indices"],
    outputs=["y"],
    axis=1,
)
data = np.random.randn(3, 3).astype(np.float32)
indices = np.array([[0, 2]])
y = np.take(data, indices, axis=1)

expect(
    node,
    inputs=[data, indices.astype(np.int64)],
    outputs=[y],
    name="test_gather_2d_indices",
)
```

</details>


<details>
<summary>gather_negative_indices</summary>

```python
node = onnx.helper.make_node(
    "Gather",
    inputs=["data", "indices"],
    outputs=["y"],
    axis=0,
)
data = np.arange(10).astype(np.float32)
indices = np.array([0, -9, -10])
y = np.take(data, indices, axis=0)

# print(y)
# [0. 1. 0.]

expect(
    node,
    inputs=[data, indices.astype(np.int64)],
    outputs=[y],
    name="test_gather_negative_indices",
)
```

</details>


### <a name="GatherElements"></a><a name="gatherelements">**GatherElements**</a>

  GatherElements takes two inputs `data` and `indices` of the same rank r >= 1
  and an optional attribute `axis` that identifies an axis of `data`
  (by default, the outer-most axis, that is axis 0). It is an indexing operation
  that produces its output by indexing into the input data tensor at index
  positions determined by elements of the `indices` tensor.
  Its output shape is the same as the shape of `indices` and consists of one value
  (gathered from the `data`) for each element in `indices`.

  For instance, in the 3-D case (r = 3), the output produced is determined
  by the following equations:
  ```
  out[i][j][k] = data[indices[i][j][k]][j][k] if axis = 0,
  out[i][j][k] = data[i][indices[i][j][k]][k] if axis = 1,
  out[i][j][k] = data[i][j][indices[i][j][k]] if axis = 2,
  ```

  This operator is also the inverse of ScatterElements. It is similar to Torch's gather operation.

  Example 1:
  ```
  data = [
      [1, 2],
      [3, 4],
  ]
  indices = [
      [0, 0],
      [1, 0],
  ]
  axis = 1
  output = [
      [1, 1],
      [4, 3],
  ]
  ```
  Example 2:
  ```
  data = [
      [1, 2, 3],
      [4, 5, 6],
      [7, 8, 9],
  ]
  indices = [
      [1, 2, 0],
      [2, 0, 0],
  ]
  axis = 0
  output = [
      [4, 8, 3],
      [7, 2, 3],
  ]
  ```
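
As a sketch, NumPy's `np.take_along_axis` implements the same per-element indexing and reproduces both examples above. This is an illustration rather than the reference implementation; note that `np.take_along_axis` additionally allows broadcasting between its arguments, which the operator description does not mention.

```python
import numpy as np

# Example 1: axis = 1, each output element picks a column within its own row.
data_a = np.array([[1, 2], [3, 4]])
indices_a = np.array([[0, 0], [1, 0]])
out_a = np.take_along_axis(data_a, indices_a, axis=1)

# Example 2: axis = 0, each output element picks a row within its own column.
data_b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
indices_b = np.array([[1, 2, 0], [2, 0, 0]])
out_b = np.take_along_axis(data_b, indices_b, axis=0)
```

In both cases the output has the same shape as `indices`.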

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#GatherElements-11">11</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 0)</dt>
<dd>Which axis to gather on. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1.</dd>
<dt><tt>indices</tt> (non-differentiable) : Tind</dt>
<dd>Tensor of int32/int64 indices, with the same rank r as the input. All index values are expected to be within bounds [-s, s-1] along axis of size s. It is an error if any of the index values are out of bounds.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor of the same shape as indices.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to any tensor type.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain indices to integer types</dd>
</dl>


#### Examples

<details>
<summary>gather_elements_0</summary>

```python
axis = 1
node = onnx.helper.make_node(
    "GatherElements",
    inputs=["data", "indices"],
    outputs=["y"],
    axis=axis,
)
data = np.array([[1, 2], [3, 4]], dtype=np.float32)
indices = np.array([[0, 0], [1, 0]], dtype=np.int32)

y = gather_elements(data, indices, axis)
# print(y) produces
# [[1, 1],
#  [4, 3]]

expect(
    node,
    inputs=[data, indices.astype(np.int64)],
    outputs=[y],
    name="test_gather_elements_0",
)
```

</details>


<details>
<summary>gather_elements_1</summary>

```python
axis = 0
node = onnx.helper.make_node(
    "GatherElements",
    inputs=["data", "indices"],
    outputs=["y"],
    axis=axis,
)
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
indices = np.array([[1, 2, 0], [2, 0, 0]], dtype=np.int32)

y = gather_elements(data, indices, axis)
# print(y) produces
# [[4, 8, 3],
#  [7, 2, 3]]

expect(
    node,
    inputs=[data, indices.astype(np.int64)],
    outputs=[y],
    name="test_gather_elements_1",
)
```

</details>


<details>
<summary>gather_elements_negative_indices</summary>

```python
axis = 0
node = onnx.helper.make_node(
    "GatherElements",
    inputs=["data", "indices"],
    outputs=["y"],
    axis=axis,
)
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32)
indices = np.array([[-1, -2, 0], [-2, 0, 0]], dtype=np.int32)

y = gather_elements(data, indices, axis)
# print(y) produces
# [[7, 5, 3],
#  [4, 2, 3]]

expect(
    node,
    inputs=[data, indices.astype(np.int64)],
    outputs=[y],
    name="test_gather_elements_negative_indices",
)
```

</details>


### <a name="GatherND"></a><a name="gathernd">**GatherND**</a>

  Given `data` tensor of rank `r` >= 1, `indices` tensor of rank `q` >= 1, and `batch_dims` integer `b`, this operator gathers
  slices of `data` into an output tensor of rank `q + r - indices_shape[-1] - 1 - b`.

  `indices` is a q-dimensional integer tensor, best thought of as a `(q-1)`-dimensional tensor of index-tuples into `data`,
  where each element defines a slice of `data`.

  `batch_dims` (denoted as `b`) is an integer indicating the number of batch dimensions, i.e., the leading `b` dimensions of
  the `data` and `indices` tensors represent the batches, and the gather starts from dimension `b+1`.

  Some salient points about the inputs' rank and shape:

  1) r >= 1 and q >= 1 are to be honored. There is no dependency condition to be met between ranks `r` and `q`.

  2) The first `b` dimensions of the shape of `indices` tensor and `data` tensor must be equal.

  3) b < min(q, r) is to be honored.

  4) `indices_shape[-1]` should have a value between 1 (inclusive) and `r-b` (inclusive).

  5) All values in `indices` are expected to be within bounds [-s, s-1] along an axis of size `s`, i.e., `-data_shape[i] <= indices[...,i] <= data_shape[i] - 1`.
     It is an error if any of the index values are out of bounds.

  The output is computed as follows:

  The output tensor is obtained by mapping each index-tuple in the `indices` tensor to the corresponding slice of the input `data`.

  1) If `indices_shape[-1] > r-b` => error condition.

  2) If `indices_shape[-1] == r-b`, since the rank of `indices` is `q`, `indices` can be thought of as `N` `(q-b-1)`-dimensional tensors
     containing 1-D tensors of dimension `r-b`, where `N` is an integer equal to the product of 1 and all the elements in the batch dimensions
     of the indices_shape. Let us think of each such 1-D tensor of length `r-b` as `indices_slice`. Each *scalar value* corresponding to `data[0:b-1,indices_slice]`
     is filled into the corresponding location of the `(q-b-1)`-dimensional tensor to form the `output` tensor (Example 1 below).

  3) If `indices_shape[-1] < r-b`, since the rank of `indices` is `q`, `indices` can be thought of as `N` `(q-b-1)`-dimensional tensors
     containing 1-D tensors of dimension `< r-b`. Let us think of each such tensor as `indices_slice`. Each *tensor slice* corresponding
     to `data[0:b-1, indices_slice , :]` is filled into the corresponding location of the `(q-b-1)`-dimensional tensor
     to form the `output` tensor (Examples 2, 3, 4 and 5 below).

  This operator is the inverse of `ScatterND`.

  **Example 1**

  ```
  batch_dims = 0
  data    = [[0,1],[2,3]]   # data_shape    = [2, 2]
  indices = [[0,0],[1,1]]   # indices_shape = [2, 2]
  output  = [0,3]           # output_shape  = [2]
  ```

  **Example 2**

  ```
  batch_dims = 0
  data    = [[0,1],[2,3]]  # data_shape    = [2, 2]
  indices = [[1],[0]]      # indices_shape = [2, 1]
  output  = [[2,3],[0,1]]  # output_shape  = [2, 2]
  ```

  **Example 3**

  ```
  batch_dims = 0
  data    = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape    = [2, 2, 2]
  indices = [[0,1],[1,0]]                 # indices_shape = [2, 2]
  output  = [[2,3],[4,5]]                 # output_shape  = [2, 2]
  ```

  **Example 4**

  ```
  batch_dims = 0
  data    = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape    = [2, 2, 2]
  indices = [[[0,1]],[[1,0]]]             # indices_shape = [2, 1, 2]
  output  = [[[2,3]],[[4,5]]]             # output_shape  = [2, 1, 2]
  ```

  **Example 5**

  ```
  batch_dims = 1
  data    = [[[0,1],[2,3]],[[4,5],[6,7]]] # data_shape    = [2, 2, 2]
  indices = [[1],[0]]                     # indices_shape = [2, 1]
  output  = [[2,3],[4,5]]                 # output_shape  = [2, 2]
  ```
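
The rules and examples above can be sketched as a small loop-based NumPy function. The name `gather_nd` is hypothetical and this is an illustrative sketch under the stated shape rules, not the reference implementation used by the test cases.

```python
import numpy as np


def gather_nd(data, indices, batch_dims=0):
    data = np.asarray(data)
    indices = np.asarray(indices)
    b = batch_dims
    k = indices.shape[-1]  # length of each index tuple
    if not 1 <= k <= data.ndim - b:
        raise ValueError("indices_shape[-1] must lie in [1, r - b]")
    if indices.shape[:b] != data.shape[:b]:
        raise ValueError("leading batch dimensions must match")

    # Flatten the batch dimensions so each batch is handled independently.
    n_batches = int(np.prod(data.shape[:b], dtype=np.int64))
    flat_data = data.reshape((n_batches,) + data.shape[b:])
    flat_idx = indices.reshape(n_batches, -1, k)

    gathered = []
    for batch in range(n_batches):
        for index_tuple in flat_idx[batch]:
            # A full-length tuple yields a scalar, a shorter one yields a slice.
            gathered.append(flat_data[batch][tuple(index_tuple)])

    out_shape = indices.shape[:-1] + data.shape[b + k:]
    return np.asarray(gathered).reshape(out_shape)


data = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])
example5 = gather_nd(data, np.array([[1], [0]]), batch_dims=1)
```

For Example 5 the output rank is `q + r - indices_shape[-1] - 1 - b = 2 + 3 - 1 - 1 - 1 = 2`, which matches the `[2, 2]` output shape.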

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#GatherND-11">11</a>, <a href="Changelog.md#GatherND-12">12</a>

#### Attributes

<dl>
<dt><tt>batch_dims</tt> : int (default is 0)</dt>
<dd>The number of batch dimensions. Gathering starts from the dimensions data[batch_dims:].</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1.</dd>
<dt><tt>indices</tt> (non-differentiable) : tensor(int64)</dt>
<dd>Tensor of rank q >= 1. All index values are expected to be within bounds [-s, s-1] along axis of size s. It is an error if any of the index values are out of bounds.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor of rank q + r - indices_shape[-1] - 1 - b, where b = batch_dims.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to any tensor type.</dd>
</dl>


#### Examples

<details>
<summary>float32</summary>

```python
node = onnx.helper.make_node(
    "GatherND",
    inputs=["data", "indices"],
    outputs=["output"],
)

data = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], dtype=np.float32)
indices = np.array([[[0, 1]], [[1, 0]]], dtype=np.int64)
output = gather_nd_impl(data, indices, 0)
expected_output = np.array([[[2, 3]], [[4, 5]]], dtype=np.float32)
assert np.array_equal(output, expected_output)
expect(
    node,
    inputs=[data, indices],
    outputs=[output],
    name="test_gathernd_example_float32",
)
```

</details>


<details>
<summary>int32</summary>

```python
node = onnx.helper.make_node(
    "GatherND",
    inputs=["data", "indices"],
    outputs=["output"],
)

data = np.array([[0, 1], [2, 3]], dtype=np.int32)
indices = np.array([[0, 0], [1, 1]], dtype=np.int64)
output = gather_nd_impl(data, indices, 0)
expected_output = np.array([0, 3], dtype=np.int32)
assert np.array_equal(output, expected_output)
expect(
    node,
    inputs=[data, indices],
    outputs=[output],
    name="test_gathernd_example_int32",
)
```

</details>


<details>
<summary>int32_batchdim_1</summary>

```python
node = onnx.helper.make_node(
    "GatherND",
    inputs=["data", "indices"],
    outputs=["output"],
    batch_dims=1,
)

data = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], dtype=np.int32)
indices = np.array([[1], [0]], dtype=np.int64)
output = gather_nd_impl(data, indices, 1)
expected_output = np.array([[2, 3], [4, 5]], dtype=np.int32)
assert np.array_equal(output, expected_output)
expect(
    node,
    inputs=[data, indices],
    outputs=[output],
    name="test_gathernd_example_int32_batch_dim1",
)
```

</details>


### <a name="Gelu"></a><a name="gelu">**Gelu**</a>

  Gelu takes one input data (Tensor<T>) and produces one
  output data (Tensor<T>) where the gaussian error linear units function,
  $y = 0.5 * x * (1 + erf(x/sqrt(2)))$ is applied to the tensor elementwise.
  If the attribute "approximate" is set to "tanh", the function estimation,
  $y = 0.5 * x * (1 + Tanh(sqrt(2/\pi) * (x + 0.044715 * x^3)))$ is used and applied
  to the tensor elementwise.
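
The two formulas can be compared side by side in NumPy. The function name `gelu` and its `approximate` keyword mirror the attribute described here but are otherwise assumptions of this sketch.

```python
import math

import numpy as np


def gelu(x, approximate="none"):
    x = np.asarray(x, dtype=np.float64)
    if approximate == "tanh":
        # Tanh-based estimate of the Gaussian CDF term.
        inner = np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)
        return 0.5 * x * (1.0 + np.tanh(inner))
    # Exact form using the error function, applied elementwise.
    erf = np.vectorize(math.erf)
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))


x = np.array([-1.0, 0.0, 1.0])
exact = gelu(x)
approx = gelu(x, approximate="tanh")
```

On this input the two variants agree to about three decimal places.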


#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>approximate</tt> : string (default is none)</dt>
<dd>Gelu approximation algorithm: `"tanh"`, `"none"` (default). `"none"`: do not use approximation. `"tanh"`: use tanh approximation.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>gelu_default</summary>

```python
node = onnx.helper.make_node("Gelu", inputs=["x"], outputs=["y"])

x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-0.15865526, 0., 0.84134474]
y = (0.5 * x * (1 + np.vectorize(math.erf)(x / np.sqrt(2)))).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_default_1")

x = np.random.randn(3, 4, 5).astype(np.float32)
# x is random here, so the expected output is computed from the formula below
y = (0.5 * x * (1 + np.vectorize(math.erf)(x / np.sqrt(2)))).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_default_2")
```

</details>


<details>
<summary>gelu_tanh</summary>

```python
node = onnx.helper.make_node(
    "Gelu", inputs=["x"], outputs=["y"], approximate="tanh"
)

x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-0.158808, 0., 0.841192]
y = (
    0.5
    * x
    * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * np.power(x, 3))))
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_tanh_1")

x = np.random.randn(3, 4, 5).astype(np.float32)
# x is random here, so the expected output is computed from the formula below
y = (
    0.5
    * x
    * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * np.power(x, 3))))
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_gelu_tanh_2")
```

</details>


### <a name="Gemm"></a><a name="gemm">**Gemm**</a>

  General Matrix multiplication:
  https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3

  * A' = transpose(A) if transA else A
  * B' = transpose(B) if transB else B

  Compute Y = alpha * A' * B' + beta * C, where input tensor A has shape (M, K) or (K, M),
  input tensor B has shape (K, N) or (N, K), input tensor C is broadcastable to shape (M, N),
  and output tensor Y has shape (M, N). A will be transposed before doing the
  computation if attribute transA is non-zero, same for B and transB.
  This operator supports **unidirectional broadcasting** (tensor C should be unidirectional broadcastable to tensor A * B); for more details please check [the doc](Broadcasting.md).
  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
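
A minimal NumPy sketch of the computation described above follows. The function name `gemm` is hypothetical and this is not the `gemm_reference_implementation` used by the test cases; the explicit shape check is an assumption added to mimic unidirectional broadcasting, since NumPy itself broadcasts in both directions.

```python
import numpy as np


def gemm(A, B, C=None, alpha=1.0, beta=1.0, transA=0, transB=0):
    A = A.T if transA else A
    B = B.T if transB else B
    Y = alpha * (A @ B)
    if C is not None:
        C = np.asarray(C)
        # C may only be broadcast up to (M, N); it must not enlarge the result.
        if np.broadcast(Y, C).shape != Y.shape:
            raise ValueError("C is not unidirectionally broadcastable to (M, N)")
        Y = Y + beta * C
    return Y


rng = np.random.default_rng(0)
a = rng.random((4, 3))  # (K, M) because transA=1
b = rng.random((5, 4))  # (N, K) because transB=1
c = rng.random((1, 5))  # broadcast over the M rows
y = gemm(a, b, c, alpha=0.25, beta=0.35, transA=1, transB=1)
```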

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Gemm-1">1</a>, <a href="Changelog.md#Gemm-6">6</a>, <a href="Changelog.md#Gemm-7">7</a>, <a href="Changelog.md#Gemm-9">9</a>, <a href="Changelog.md#Gemm-11">11</a>

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 1.0)</dt>
<dd>Scalar multiplier for the product of input tensors A * B.</dd>
<dt><tt>beta</tt> : float (default is 1.0)</dt>
<dd>Scalar multiplier for input tensor C.</dd>
<dt><tt>transA</tt> : int (default is 0)</dt>
<dd>Whether A should be transposed</dd>
<dt><tt>transB</tt> : int (default is 0)</dt>
<dd>Whether B should be transposed</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>A</tt> (differentiable) : T</dt>
<dd>Input tensor A. The shape of A should be (M, K) if transA is 0, or (K, M) if transA is non-zero.</dd>
<dt><tt>B</tt> (differentiable) : T</dt>
<dd>Input tensor B. The shape of B should be (K, N) if transB is 0, or (N, K) if transB is non-zero.</dd>
<dt><tt>C</tt> (optional, differentiable) : T</dt>
<dd>Optional input tensor C. If not specified, the computation is done as if C is a scalar 0. The shape of C should be unidirectional broadcastable to (M, N).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor of shape (M, N).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float/int tensors.</dd>
</dl>


#### Examples

<details>
<summary>all_attributes</summary>

```python
node = onnx.helper.make_node(
    "Gemm",
    inputs=["a", "b", "c"],
    outputs=["y"],
    alpha=0.25,
    beta=0.35,
    transA=1,
    transB=1,
)
a = np.random.ranf([4, 3]).astype(np.float32)
b = np.random.ranf([5, 4]).astype(np.float32)
c = np.random.ranf([1, 5]).astype(np.float32)
y = gemm_reference_implementation(
    a, b, c, transA=1, transB=1, alpha=0.25, beta=0.35
)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_all_attributes")
```

</details>


<details>
<summary>alpha</summary>

```python
node = onnx.helper.make_node(
    "Gemm", inputs=["a", "b", "c"], outputs=["y"], alpha=0.5
)
a = np.random.ranf([3, 5]).astype(np.float32)
b = np.random.ranf([5, 4]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, alpha=0.5)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_alpha")
```

</details>


<details>
<summary>beta</summary>

```python
node = onnx.helper.make_node(
    "Gemm", inputs=["a", "b", "c"], outputs=["y"], beta=0.5
)
a = np.random.ranf([2, 7]).astype(np.float32)
b = np.random.ranf([7, 4]).astype(np.float32)
c = np.random.ranf([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, beta=0.5)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_beta")
```

</details>


<details>
<summary>default_matrix_bias</summary>

```python
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([3, 6]).astype(np.float32)
b = np.random.ranf([6, 4]).astype(np.float32)
c = np.random.ranf([3, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
    node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_matrix_bias"
)
```

</details>


<details>
<summary>default_no_bias</summary>

```python
node = onnx.helper.make_node("Gemm", inputs=["a", "b"], outputs=["y"])
a = np.random.ranf([2, 10]).astype(np.float32)
b = np.random.ranf([10, 3]).astype(np.float32)
y = gemm_reference_implementation(a, b)
expect(node, inputs=[a, b], outputs=[y], name="test_gemm_default_no_bias")
```

</details>


<details>
<summary>default_scalar_bias</summary>

```python
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([2, 3]).astype(np.float32)
b = np.random.ranf([3, 4]).astype(np.float32)
c = np.array(3.14).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
    node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_scalar_bias"
)
```

</details>


<details>
<summary>default_single_elem_vector_bias</summary>

```python
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([3, 7]).astype(np.float32)
b = np.random.ranf([7, 3]).astype(np.float32)
c = np.random.ranf([1]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
    node,
    inputs=[a, b, c],
    outputs=[y],
    name="test_gemm_default_single_elem_vector_bias",
)
```

</details>


<details>
<summary>default_vector_bias</summary>

```python
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([2, 7]).astype(np.float32)
b = np.random.ranf([7, 4]).astype(np.float32)
c = np.random.ranf([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(
    node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_vector_bias"
)
```

</details>


<details>
<summary>default_zero_bias</summary>

```python
node = onnx.helper.make_node("Gemm", inputs=["a", "b", "c"], outputs=["y"])
a = np.random.ranf([3, 5]).astype(np.float32)
b = np.random.ranf([5, 4]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_default_zero_bias")
```

</details>


<details>
<summary>transposeA</summary>

```python
node = onnx.helper.make_node(
    "Gemm", inputs=["a", "b", "c"], outputs=["y"], transA=1
)
a = np.random.ranf([6, 3]).astype(np.float32)
b = np.random.ranf([6, 4]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, transA=1)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_transposeA")
```

</details>


<details>
<summary>transposeB</summary>

```python
node = onnx.helper.make_node(
    "Gemm", inputs=["a", "b", "c"], outputs=["y"], transB=1
)
a = np.random.ranf([3, 6]).astype(np.float32)
b = np.random.ranf([4, 6]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_reference_implementation(a, b, c, transB=1)
expect(node, inputs=[a, b, c], outputs=[y], name="test_gemm_transposeB")
```

</details>
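
The Gemm examples above call `gemm_reference_implementation`, whose body is not shown in this section. The following is a minimal sketch of what such a helper might look like, assuming the usual Gemm definition `Y = alpha * A' * B' + beta * C`. The name `gemm_sketch` and its keyword defaults are illustrative assumptions, not the actual test helper.

```python
import numpy as np


def gemm_sketch(a, b, c=None, alpha=1.0, beta=1.0, transA=0, transB=0):
    """Hypothetical stand-in for the helper used in the examples above."""
    a = a.T if transA else a  # optionally transpose A
    b = b.T if transB else b  # optionally transpose B
    y = alpha * (a @ b)
    if c is not None:
        # NumPy broadcasting covers scalar, (1,), and (1, N) biases.
        y = y + beta * c
    return y


a = np.random.ranf([6, 3]).astype(np.float32)
b = np.random.ranf([6, 4]).astype(np.float32)
c = np.zeros([1, 4]).astype(np.float32)
y = gemm_sketch(a, b, c, transA=1)
print(y.shape)  # (3, 4)
```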


### <a name="GlobalAveragePool"></a><a name="globalaveragepool">**GlobalAveragePool**</a>

  GlobalAveragePool consumes an input tensor X and applies average pooling across
   the values in the same channel. This is equivalent to AveragePool with kernel size
   equal to the spatial dimensions of the input tensor.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for the image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor from pooling across the input tensor. The output tensor has the same rank as the input. The first two dimensions of output shape are the same as the input (N x C), while the other dimensions are all 1.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>globalaveragepool</summary>

```python
node = onnx.helper.make_node(
    "GlobalAveragePool",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(1, 3, 5, 5).astype(np.float32)
y = np.mean(x, axis=tuple(range(2, np.ndim(x))), keepdims=True)
expect(node, inputs=[x], outputs=[y], name="test_globalaveragepool")
```

</details>


<details>
<summary>globalaveragepool_precomputed</summary>

```python
node = onnx.helper.make_node(
    "GlobalAveragePool",
    inputs=["x"],
    outputs=["y"],
)
x = np.array(
    [
        [
            [
                [1, 2, 3],
                [4, 5, 6],
                [7, 8, 9],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[5]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_globalaveragepool_precomputed")
```

</details>


### <a name="GlobalLpPool"></a><a name="globallppool">**GlobalLpPool**</a>

  GlobalLpPool consumes an input tensor X and applies Lp pooling across
   the values in the same channel. This is equivalent to LpPool with kernel size
   equal to the spatial dimensions of the input tensor.

#### Version

This version of the operator has been available since version 2 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#GlobalLpPool-1">1</a>

#### Attributes

<dl>
<dt><tt>p</tt> : int (default is 2)</dt>
<dd>p value of the Lp norm used to pool over the input data.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for the image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor from pooling across the input tensor. The output tensor has the same rank as the input. The first two dimensions of output shape are the same as the input (N x C), while the other dimensions are all 1.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
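
This operator has no example in this section. As a rough illustration, the sketch below computes a global Lp pool in NumPy, assuming the Lp norm is taken as `(sum(|x|^p))^(1/p)` over all spatial axes of each channel. The function name `global_lp_pool_sketch` is hypothetical and is not part of the ONNX test suite.

```python
import numpy as np


def global_lp_pool_sketch(x, p=2):
    """Assumed semantics: Lp norm over every axis after (N, C)."""
    spatial_axes = tuple(range(2, x.ndim))
    return np.sum(np.abs(x) ** p, axis=spatial_axes, keepdims=True) ** (1.0 / p)


# One batch, one channel, 2 x 2 spatial grid.
x = np.array([[[[3.0, 4.0], [0.0, 0.0]]]], dtype=np.float32)
y = global_lp_pool_sketch(x, p=2)
print(y.shape)  # (1, 1, 1, 1)
```

The output keeps the rank of the input, with every spatial dimension reduced to 1, which matches the output description above.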


### <a name="GlobalMaxPool"></a><a name="globalmaxpool">**GlobalMaxPool**</a>

  GlobalMaxPool consumes an input tensor X and applies max pooling across
   the values in the same channel. This is equivalent to MaxPool with kernel size
   equal to the spatial dimensions of the input tensor.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for the image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor from pooling across the input tensor. The output tensor has the same rank as the input. The first two dimensions of output shape are the same as the input (N x C), while the other dimensions are all 1.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>globalmaxpool</summary>

```python
node = onnx.helper.make_node(
    "GlobalMaxPool",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(1, 3, 5, 5).astype(np.float32)
y = np.max(x, axis=tuple(range(2, np.ndim(x))), keepdims=True)
expect(node, inputs=[x], outputs=[y], name="test_globalmaxpool")
```

</details>


<details>
<summary>globalmaxpool_precomputed</summary>

```python
node = onnx.helper.make_node(
    "GlobalMaxPool",
    inputs=["x"],
    outputs=["y"],
)
x = np.array(
    [
        [
            [
                [1, 2, 3],
                [4, 5, 6],
                [7, 8, 9],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[9]]]]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_globalmaxpool_precomputed")
```

</details>


### <a name="Greater"></a><a name="greater">**Greater**</a>

  Returns the tensor resulting from performing the `greater` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Greater-1">1</a>, <a href="Changelog.md#Greater-7">7</a>, <a href="Changelog.md#Greater-9">9</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input types to all numeric tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>


#### Examples

<details>
<summary>greater</summary>

```python
node = onnx.helper.make_node(
    "Greater",
    inputs=["x", "y"],
    outputs=["greater"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater")
```

</details>


<details>
<summary>greater_equal</summary>

```python
node = onnx.helper.make_node(
    "GreaterOrEqual",
    inputs=["x", "y"],
    outputs=["greater_equal"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal")
```

</details>


<details>
<summary>greater_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Greater",
    inputs=["x", "y"],
    outputs=["greater"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.greater(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_bcast")
```

</details>


<details>
<summary>greater_equal_broadcast</summary>

```python
node = onnx.helper.make_node(
    "GreaterOrEqual",
    inputs=["x", "y"],
    outputs=["greater_equal"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.greater_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_greater_equal_bcast")
```

</details>


### <a name="GreaterOrEqual"></a><a name="greaterorequal">**GreaterOrEqual**</a>

  Returns the tensor resulting from performing the `greater_equal` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 16 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#GreaterOrEqual-12">12</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input types to all numeric tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>
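
As a small illustration of the semantics, the NumPy sketch below shows that an elementwise `greater_equal` gives the same result as combining `greater` and `equal` with a logical OR, including under broadcasting. The arrays are made-up sample values.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)
y = np.array([2.0, 2.0, 7.0], dtype=np.float32)  # broadcast across both rows

ge = np.greater_equal(x, y)
composed = np.logical_or(np.greater(x, y), np.equal(x, y))

print(ge.dtype, ge.shape)  # bool (2, 3)
print(np.array_equal(ge, composed))  # True
```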


### <a name="GridSample"></a><a name="gridsample">**GridSample**</a>

  Given an input `X` and a flow-field `grid`, computes the output `Y` using `X` values and pixel locations from the `grid`.
  For spatial input `X` with shape (N, C, H, W), the `grid` will have shape (N, H_out, W_out, 2),
  the output `Y` will have shape (N, C, H_out, W_out). For volumetric input `X` with shape (N, C, D, H, W),
  the `grid` will have shape (N, D_out, H_out, W_out, 3), the output `Y` will have shape (N, C, D_out, H_out, W_out).
  More generally, for an input `X` of rank r+2 with shape (N, C, d1, d2, ..., dr),
  the `grid` will have shape (N, D1_out, D2_out, ..., Dr_out, r), the output `Y` will have shape (N, C, D1_out, D2_out, ..., Dr_out).

  The tensor `X` contains values at centers of square pixels (voxels, etc) locations such as (n, c, d1_in, d2_in, ..., dr_in).
  The (n, d1_out, d2_out, ..., dr_out, :) values from the tensor `grid` are the normalized positions for interpolating the values
  at the (n, c, d1_out, d2_out, ..., dr_out) locations from the output tensor `Y` using a specified interpolation method (the mode)
  and a padding mode (for `grid` positions falling outside the bounds of the input `X`).

  For example, the values in `grid[n, h_out, w_out, :]` are size-2 vectors specifying normalized positions in the 2-dimensional space of `X`.
  They are used to interpolate output values of `Y[n, c, h_out, w_out]`.

  The GridSample operator is often used as the grid generator and sampler in
  [Spatial Transformer Networks](https://arxiv.org/abs/1506.02025).
  See also [torch.nn.functional.grid_sample](https://pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html).

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#GridSample-16">16</a>

#### Attributes

<dl>
<dt><tt>align_corners</tt> : int (default is 0)</dt>
<dd>If align_corners=1, the extrema (-1 and 1) are considered as referring to the center points of the input's corner pixels (voxels, etc.). If align_corners=0, they are instead considered as referring to the corner points of the input's corner pixels (voxels, etc.), making the sampling more resolution agnostic.</dd>
<dt><tt>mode</tt> : string (default is linear)</dt>
<dd>Three interpolation modes: linear (default), nearest and cubic. The "linear" mode includes linear and N-linear interpolation modes depending on the number of spatial dimensions of the input tensor (i.e. linear for 1 spatial dimension, bilinear for 2 spatial dimensions, etc.). The "cubic" mode also includes N-cubic interpolation modes following the same rules. The "nearest" mode rounds to the nearest even index when the sampling point falls halfway between two indices.</dd>
<dt><tt>padding_mode</tt> : string (default is zeros)</dt>
<dd>Supported padding modes for out-of-bound grid values: `zeros` (default), `border`, `reflection`. zeros: use 0 for out-of-bound grid locations, border: use border values for out-of-bound grid locations, reflection: use values at locations reflected by the border for out-of-bound grid locations. If index 0 represents the margin pixel, the reflected value at index -1 will be the same as the value at index 1. For locations far away from the border, the reflection is repeated until the location is in bounds. For example, a pixel location x = -3.5 is reflected by border -1 to x' = 1.5, and then reflected by border 1 to x'' = 0.5.</dd>
</dl>
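
The repeated reflection described for `padding_mode="reflection"` can be sketched as a fold of the coordinate into the interval between two borders. The helper name `reflect` and its signature are assumptions made for illustration; the check reuses the x = -3.5 example with borders -1 and 1 from the attribute description.

```python
def reflect(x, lo, hi):
    """Reflect x about the borders lo and hi until it lies in [lo, hi]."""
    span = hi - lo
    if span == 0:
        return lo
    # Reflection is periodic with period 2 * span, so fold once with modulo.
    t = (x - lo) % (2 * span)
    return lo + (t if t <= span else 2 * span - t)


print(reflect(-3.5, -1.0, 1.0))  # 0.5
```

Folding with a modulo avoids looping over individual reflections, which matters for locations far from the border.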

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T1</dt>
<dd>Input tensor of rank r+2 that has shape (N, C, D1, D2, ..., Dr), where N is the batch size, C is the number of channels, D1, D2, ..., Dr are the spatial dimensions.</dd>
<dt><tt>grid</tt> (non-differentiable) : T2</dt>
<dd>Input offset of shape (N, D1_out, D2_out, ..., Dr_out, r), where D1_out, D2_out, ..., Dr_out are the spatial dimensions of the grid and output, and r is the number of spatial dimensions. Grid specifies the sampling locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. If the grid has values outside the range of [-1, 1], the corresponding outputs will be handled as defined by padding_mode. Following computer vision convention, the coordinates in the length-r location vector are listed from the innermost tensor dimension to the outermost, the opposite of regular tensor indexing.</dd>
</dl>
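
To make the role of `align_corners` and the coordinate order concrete, here is a minimal 2-D bilinear sampling sketch with `zeros` padding. It assumes the denormalization formulas follow the `torch.nn.functional.grid_sample` convention linked above; the helper names `denormalize` and `bilinear_sample_2d` are hypothetical. The input and expected values are taken from the `gridsample_mode_aligncorners` example in this section.

```python
import numpy as np


def denormalize(coord, size, align_corners):
    """Map a normalized coordinate in [-1, 1] to a pixel coordinate (assumed formula)."""
    if align_corners:
        return (coord + 1.0) / 2.0 * (size - 1)
    return ((coord + 1.0) * size - 1.0) / 2.0


def bilinear_sample_2d(img, gx, gy, align_corners=0):
    """Sample img of shape (H, W) at normalized (gx, gy); out-of-bound taps count as 0."""
    height, width = img.shape
    x = denormalize(gx, width, align_corners)   # first grid component: width axis
    y = denormalize(gy, height, align_corners)  # second grid component: height axis
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    out = 0.0
    for yi, wy in ((y0, 1.0 - (y - y0)), (y0 + 1, y - y0)):
        for xi, wx in ((x0, 1.0 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yi < height and 0 <= xi < width:
                out += wy * wx * img[yi, xi]
    return out


X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
points = [(-1.0, -1.0), (-0.5, -0.5), (-0.2, -0.2), (0.0, 0.0)]
row = [bilinear_sample_2d(X, gx, gy, align_corners=0) for gx, gy in points]
print(np.allclose(row, [0.0, 0.5, 1.7, 2.5]))  # True
```

Note that the first grid component indexes the width axis and the second the height axis, which is the innermost-to-outermost order described for `grid`.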

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T1</dt>
<dd>Output tensor of rank r+2 that has shape (N, C, D1_out, D2_out, ..., Dr_out) of the sampled values. For integer input types, intermediate values are computed as floating point and cast to integer at the end.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input `X` and output `Y` types to all tensor types.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain grid types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>gridsample</summary>

```python
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="linear",
    padding_mode="zeros",
    align_corners=0,
)
# X shape, [N, C, H, W] - [1, 1, 4, 4]
X = np.array(
    [
        [
            [
                [0.0, 1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0, 7.0],
                [8.0, 9.0, 10.0, 11.0],
                [12.0, 13.0, 14.0, 15.0],
            ]
        ]
    ],
    dtype=np.float32,
)
# Grid shape, [N, H_out, W_out, 2] - [1, 6, 6, 2]
Grid = np.array(
    [
        [
            [
                [-1.0000, -1.0000],
                [-0.6000, -1.0000],
                [-0.2000, -1.0000],
                [0.2000, -1.0000],
                [0.6000, -1.0000],
                [1.0000, -1.0000],
            ],
            [
                [-1.0000, -0.6000],
                [-0.6000, -0.6000],
                [-0.2000, -0.6000],
                [0.2000, -0.6000],
                [0.6000, -0.6000],
                [1.0000, -0.6000],
            ],
            [
                [-1.0000, -0.2000],
                [-0.6000, -0.2000],
                [-0.2000, -0.2000],
                [0.2000, -0.2000],
                [0.6000, -0.2000],
                [1.0000, -0.2000],
            ],
            [
                [-1.0000, 0.2000],
                [-0.6000, 0.2000],
                [-0.2000, 0.2000],
                [0.2000, 0.2000],
                [0.6000, 0.2000],
                [1.0000, 0.2000],
            ],
            [
                [-1.0000, 0.6000],
                [-0.6000, 0.6000],
                [-0.2000, 0.6000],
                [0.2000, 0.6000],
                [0.6000, 0.6000],
                [1.0000, 0.6000],
            ],
            [
                [-1.0000, 1.0000],
                [-0.6000, 1.0000],
                [-0.2000, 1.0000],
                [0.2000, 1.0000],
                [0.6000, 1.0000],
                [1.0000, 1.0000],
            ],
        ]
    ],
    dtype=np.float32,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 6, 6]
Y = np.array(
    [
        [
            [
                [0.0000, 0.1500, 0.5500, 0.9500, 1.3500, 0.7500],
                [0.6000, 1.5000, 2.3000, 3.1000, 3.9000, 2.1000],
                [2.2000, 4.7000, 5.5000, 6.3000, 7.1000, 3.7000],
                [3.8000, 7.9000, 8.7000, 9.5000, 10.3000, 5.3000],
                [5.4000, 11.1000, 11.9000, 12.7000, 13.5000, 6.9000],
                [3.0000, 6.1500, 6.5500, 6.9500, 7.3500, 3.7500],
            ]
        ]
    ],
    dtype=np.float32,
)
expect(node, inputs=[X, Grid], outputs=[Y], name="test_gridsample")
```

</details>


<details>
<summary>gridsample_mode_aligncorners</summary>

```python
# X shape, [N, C, H, W] - [1, 1, 3, 2]
X = np.array(
    [[[[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]]],
    dtype=np.float32,
)
# Grid shape, [N, H_out, W_out, 2] - [1, 2, 4, 2]
Grid = np.array(
    [
        [
            [
                [-1.0000, -1.0000],
                [-0.5000, -0.5000],
                [-0.2000, -0.2000],
                [0.0000, 0.0000],
            ],
            [
                [0.0000, 0.0000],
                [-0.2000, -0.2000],
                [0.5000, 0.5000],
                [1.0000, 1.0000],
            ],
        ]
    ],
    dtype=np.float32,
)

# setting mode = 'linear' (bilinear for 2-D spatial input), default align_corners = 0
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="linear",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bilinear = np.array(
    [[[[0.0000, 0.5000, 1.7000, 2.5000], [2.5000, 1.7000, 4.5000, 1.2500]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_bilinear],
    name="test_gridsample_bilinear",
)

# setting mode = 'linear' (bilinear for 2-D spatial input), align_corners = 1
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="linear",
    align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_align_corners = np.array(
    [[[[0.0000, 1.2500, 2.0000, 2.5000], [2.5000, 2.0000, 3.7500, 5.0000]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_align_corners],
    name="test_gridsample_aligncorners_true",
)

# setting mode = 'nearest'
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="nearest",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_nearest = np.array(
    [[[[0.0, 0.0, 2.0, 2.0], [2.0, 2.0, 5.0, 0.0]]]],
    dtype=np.float32,
)

expect(
    node, inputs=[X, Grid], outputs=[Y_nearest], name="test_gridsample_nearest"
)

# setting mode = 'cubic' (bicubic for 2-D spatial input)
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="cubic",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bicubic = np.array(
    [[[[-0.1406, 0.3828, 1.7556, 2.9688], [2.9688, 1.7556, 5.1445, 1.3906]]]],
    dtype=np.float32,
)

expect(
    node, inputs=[X, Grid], outputs=[Y_bicubic], name="test_gridsample_bicubic"
)

# ============================================================================
# Additional tests
# The reference output tensors were generated using PyTorch 2.0.
Grid = np.array(
    [
        [
            [[-1.0, -0.8], [-0.6, -0.5], [-0.1, -0.2], [0.7, 0.0]],
            [[0.0, 0.4], [0.2, -0.2], [-0.3, 0.5], [-1.0, 1.0]],
        ]
    ],
    dtype=np.float32,
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="nearest",
    align_corners=0,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_nearest = np.array(
    [[[[0.0, 0.0, 2.0, 3.0], [4.0, 3.0, 4.0, 4.0]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_nearest],
    name="test_gridsample_nearest_align_corners_0_additional_1",
)

# setting mode = 'nearest', align_corners = 1
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="nearest",
    align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_nearest = np.array(
    [[[[0.0, 0.0, 2.0, 3.0], [2.0, 3.0, 4.0, 4.0]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_nearest],
    name="test_gridsample_nearest_align_corners_1_additional_1",
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="linear",
    align_corners=0,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bilinear = np.array(
    [[[[0.0000, 0.4500, 1.8000, 2.4000], [3.7000, 2.1000, 3.7000, 1.0000]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_bilinear],
    name="test_gridsample_bilinear_align_corners_0_additional_1",
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="linear",
    align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bilinear = np.array(
    [[[[0.4000, 1.2000, 2.0500, 2.8500], [3.3000, 2.2000, 3.3500, 4.0000]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_bilinear],
    name="test_gridsample_bilinear_align_corners_1_additional_1",
)

# These two new bicubic tests produce slightly higher error ~5e-5
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="cubic",
    align_corners=0,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bicubic = np.array(
    [
        [
            [
                [-0.173250, 0.284265, 1.923106, 2.568000],
                [5.170375, 2.284414, 4.744844, 1.046875],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_bicubic],
    name="test_gridsample_bicubic_align_corners_0_additional_1",
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="cubic",
    align_corners=1,
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_bicubic = np.array(
    [
        [
            [
                [0.304001, 1.128750, 2.266270, 3.144844],
                [4.531500, 2.455360, 4.599819, 4.000000],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_bicubic],
    name="test_gridsample_bicubic_align_corners_1_additional_1",
)
```

</details>


<details>
<summary>gridsample_paddingmode</summary>

```python
# X shape, [N, C, H, W] - [1, 1, 3, 2]
X = np.array(
    [[[[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]]],
    dtype=np.float32,
)
# Grid shape, [N, H_out, W_out, 2] - [1, 2, 4, 2]
Grid = np.array(
    [
        [
            [
                [-10.0000, -10.0000],
                [-5.0000, -5.0000],
                [-0.2000, -0.2000],
                [10.0000, 10.0000],
            ],
            [
                [10.0000, 10.0000],
                [-0.2000, -0.2000],
                [5.0000, 5.0000],
                [10.0000, 10.0000],
            ],
        ]
    ],
    dtype=np.float32,
)

# setting padding_mode = 'zeros'
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    padding_mode="zeros",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_zeros = np.array(
    [[[[0.0000, 0.0000, 1.7000, 0.0000], [0.0000, 1.7000, 0.0000, 0.0000]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_zeros],
    name="test_gridsample_zeros_padding",
)

# setting padding_mode = 'border'
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    padding_mode="border",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_border = np.array(
    [[[[0.0000, 0.0000, 1.7000, 5.0000], [5.0000, 1.7000, 5.0000, 5.0000]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_border],
    name="test_gridsample_border_padding",
)

# setting padding_mode = 'reflection'
node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    padding_mode="reflection",
)
# Y shape, [N, C, H_out, W_out] - [1, 1, 2, 4]
Y_reflection = np.array(
    [[[[2.5000, 0.0000, 1.7000, 2.5000], [2.5000, 1.7000, 5.0000, 2.5000]]]],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_reflection],
    name="test_gridsample_reflection_padding",
)
```

</details>


<details>
<summary>volumetric_gridsample_mode_aligncorners</summary>

```python
X = np.array(
    [
        [
            [
                [[1.0, 2.0], [3.0, 4.0]],
                [[5.0, 6.0], [7.0, 8.0]],
                [[9.0, 10.0], [11.0, 12.0]],
            ]
        ]
    ],
    dtype=np.float32,
)

Grid = np.array(
    [
        [
            [
                [[-1.0, -1.0, -1.0], [-1.0, -0.5, 0.3]],
                [[-0.5, -0.5, -0.5], [1.0, -0.6, -1.0]],
                [[-0.2, -0.2, -0.2], [0.4, 0.2, 0.6]],
                [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
            ],
            [
                [[0.0, 0.0, 0.0], [-1.0, 1.0, 0.0]],
                [[-0.2, -0.2, -0.2], [1.0, 0.4, -0.2]],
                [[0.5, 0.5, 0.5], [-1.0, -0.8, 0.8]],
                [[1.0, 1.0, 1.0], [0.4, 0.6, -0.3]],
            ],
        ]
    ],
    dtype=np.float32,
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="nearest",
    align_corners=0,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_nearest = np.array(
    [
        [
            [
                [[1.0, 5.0], [1.0, 0.0], [5.0, 12.0], [5.0, 5.0]],
                [[5.0, 0.0], [5.0, 0.0], [12.0, 9.0], [0.0, 8.0]],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_nearest],
    name="test_gridsample_volumetric_nearest_align_corners_0",
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="nearest",
    align_corners=1,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_nearest = np.array(
    [
        [
            [
                [[1.0, 5.0], [1.0, 2.0], [5.0, 12.0], [5.0, 5.0]],
                [[5.0, 7.0], [5.0, 8.0], [12.0, 9.0], [12.0, 8.0]],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_nearest],
    name="test_gridsample_volumetric_nearest_align_corners_1",
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="linear",
    align_corners=0,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_bilinear = np.array(
    [
        [
            [
                [
                    [0.1250, 3.4000],
                    [2.0000, 0.4500],
                    [4.7000, 10.9000],
                    [6.5000, 3.0000],
                ],
                [
                    [6.5000, 1.7500],
                    [4.7000, 3.3000],
                    [11.0000, 2.5200],
                    [1.5000, 5.4900],
                ],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_bilinear],
    name="test_gridsample_volumetric_bilinear_align_corners_0",
)

node = onnx.helper.make_node(
    "GridSample",
    inputs=["X", "Grid"],
    outputs=["Y"],
    mode="linear",
    align_corners=1,
)
# Y shape, [N, C, D_out, H_out, W_out] - [1, 1, 2, 4, 2]
Y_bilinear = np.array(
    [
        [
            [
                [
                    [1.0000, 6.7000],
                    [3.7500, 2.4000],
                    [5.4000, 9.3000],
                    [6.5000, 6.0000],
                ],
                [
                    [6.5000, 7.0000],
                    [5.4000, 6.6000],
                    [9.2500, 8.4000],
                    [12.0000, 6.1000],
                ],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, Grid],
    outputs=[Y_bilinear],
    name="test_gridsample_volumetric_bilinear_align_corners_1",
)
```

</details>


### <a name="GroupNormalization"></a><a name="groupnormalization">**GroupNormalization**</a>

  A GroupNormalization function. Carries out group normalization as described in
  the paper https://arxiv.org/abs/1803.08494

  This operator transforms input according to
  ```
  y = scale * (x - mean) / sqrt(variance + epsilon) + bias,
  ```
  where the mean and variance are computed per instance per group of channels, and
  `scale` and `bias` should be specified for each channel. The number of channels
  `C` should be divisible by the number of groups `num_groups` so that there are
  an equal number of channels per group.

  The overall computation has two stages: the first stage normalizes the elements to
  have zero mean and unit variance for each instance in each group, and the second
  stage scales and shifts the results of the first stage. The floating-point precision
  used in the first stage is determined by the `stash_type` attribute. For example,
  if `stash_type` is 1, the operator casts all input variables to 32-bit float,
  performs the computation, and finally casts the normalized results back to the
  original type of `X`. The second stage does not depend on `stash_type`.

  When the number of groups is the same as the number of channels, this operator is
  equivalent to InstanceNormalization. When there is only one group, this operator
  is equivalent to LayerNormalization.
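
The two stages described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: `group_norm_sketch` is a hypothetical name, `scale` and `bias` are taken to have shape `(C)` as listed under Inputs, and the `stash_type` casting is not modeled.

```python
import numpy as np


def group_norm_sketch(x, num_groups, scale, bias, epsilon=1e-5):
    """Illustrative two-stage group normalization for x of shape (N, C, ...)."""
    n, c = x.shape[:2]
    assert c % num_groups == 0, "C must be divisible by num_groups"
    # Stage 1: normalize every (instance, group) block to zero mean, unit variance.
    grouped = x.reshape(n, num_groups, -1)
    mean = grouped.mean(axis=2, keepdims=True)
    var = grouped.var(axis=2, keepdims=True)
    normalized = ((grouped - mean) / np.sqrt(var + epsilon)).reshape(x.shape)
    # Stage 2: per-channel scale and shift, broadcast over the spatial axes.
    per_channel = (1, c) + (1,) * (x.ndim - 2)
    return scale.reshape(per_channel) * normalized + bias.reshape(per_channel)


x = np.random.randn(3, 4, 2, 2)
y = group_norm_sketch(x, num_groups=2, scale=np.ones(4), bias=np.zeros(4))
print(y.shape)  # (3, 4, 2, 2)
```

With unit scale and zero bias, each group of the output has a mean close to zero, which is a quick sanity check for stage one.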

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#GroupNormalization-18">18</a>

#### Attributes

<dl>
<dt><tt>epsilon</tt> : float (default is 1e-05)</dt>
<dd>The epsilon value to use to avoid division by zero.</dd>
<dt><tt>num_groups</tt> : int (required)</dt>
<dd>The number of groups of channels. It should be a divisor of the number of channels `C`.</dd>
<dt><tt>stash_type</tt> : int (default is 1)</dt>
<dd>The floating-point precision used in stage one of the computation.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor. Dimensions for image cases are `(N x C x H x W)`, where `N` is the batch size, `C` is the number of channels, and `H` and `W` are the height and width of the data. Statistics are computed for every group of channels over `C`, `H`, and `W`. For non-image cases, the dimensions are in the form of `(N x C x D1 x D2 ... Dn)`.</dd>
<dt><tt>scale</tt> (differentiable) : T</dt>
<dd>Scale tensor of shape `(C)`.</dd>
<dt><tt>bias</tt> (differentiable) : T</dt>
<dd>Bias tensor of shape `(C)`.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>The output tensor of the same shape as `X`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bfloat16), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>epsilon</summary>

```python
c = 4
num_groups = 2
x = np.random.randn(3, c, 2, 2).astype(np.float32)
scale = np.random.randn(c).astype(np.float32)
bias = np.random.randn(c).astype(np.float32)
epsilon = 1e-2
y = _group_normalization(x, num_groups, scale, bias, epsilon).astype(np.float32)

node = onnx.helper.make_node(
    "GroupNormalization",
    inputs=["x", "scale", "bias"],
    outputs=["y"],
    epsilon=epsilon,
    num_groups=num_groups,
)

expect(
    node,
    inputs=[x, scale, bias],
    outputs=[y],
    name="test_group_normalization_epsilon",
)
```

</details>


<details>
<summary>groupnormalization</summary>

```python
c = 4
num_groups = 2
x = np.random.randn(3, c, 2, 2).astype(np.float32)
scale = np.random.randn(c).astype(np.float32)
bias = np.random.randn(c).astype(np.float32)
y = _group_normalization(x, num_groups, scale, bias).astype(np.float32)

node = onnx.helper.make_node(
    "GroupNormalization",
    inputs=["x", "scale", "bias"],
    outputs=["y"],
    num_groups=num_groups,
)

expect(
    node,
    inputs=[x, scale, bias],
    outputs=[y],
    name="test_group_normalization_example",
)
```

</details>


### <a name="HammingWindow"></a><a name="hammingwindow">**HammingWindow**</a>

  Generates a Hamming window as described in the paper https://ieeexplore.ieee.org/document/1455106.

#### Version

This version of the operator has been available since version 17 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>output_datatype</tt> : int (default is 1)</dt>
<dd>The data type of the output tensor. Must be one of the values of the DataType enum in TensorProto that correspond to T2. The default value is 1 = FLOAT.</dd>
<dt><tt>periodic</tt> : int (default is 1)</dt>
<dd>If 1, returns a window to be used as a periodic function. If 0, returns a symmetric window. When 'periodic' is specified, the operator computes a window of length size + 1 and returns the first size points. The default value is 1.</dd>
</dl>
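
The effect of the `periodic` attribute can be sketched as follows. This is an illustration only: `hamming_window_sketch` is a hypothetical helper, the coefficient `a0 = 25 / 46` is taken from this operator's test example, and a symmetric window with `size` of 1 is not handled.

```python
import numpy as np


def hamming_window_sketch(size, periodic=1):
    # Hypothetical helper mirroring the `periodic` attribute.
    a0 = 25 / 46
    a1 = 1 - a0
    # Periodic windows divide by size; symmetric windows divide by size - 1.
    denom = size if periodic else size - 1
    n = np.arange(size, dtype=np.float32)
    return (a0 - a1 * np.cos(2 * np.pi * n / denom)).astype(np.float32)


periodic = hamming_window_sketch(10)
symmetric = hamming_window_sketch(10, periodic=0)
```

The symmetric window reads the same forwards and backwards, while the periodic window omits the final point that would repeat the first sample.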

#### Inputs

<dl>
<dt><tt>size</tt> (non-differentiable) : T1</dt>
<dd>A scalar value indicating the length of the window.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (non-differentiable) : T2</dt>
<dd>A Hamming window with length: size. The output has the shape: [size].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain the input size to int32 or int64 tensors.</dd>
<dt><tt>T2</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>hammingwindow</summary>

```python
# Test periodic window
node = onnx.helper.make_node(
    "HammingWindow",
    inputs=["x"],
    outputs=["y"],
)
size = np.int32(10)
a0 = 25 / 46
a1 = 1 - a0
y = a0 - a1 * np.cos(2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
expect(node, inputs=[size], outputs=[y], name="test_hammingwindow")

# Test symmetric window
node = onnx.helper.make_node(
    "HammingWindow", inputs=["x"], outputs=["y"], periodic=0
)
size = np.int32(10)
a0 = 25 / 46
a1 = 1 - a0
y = a0 - a1 * np.cos(
    2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
expect(node, inputs=[size], outputs=[y], name="test_hammingwindow_symmetric")
```

</details>


### <a name="HannWindow"></a><a name="hannwindow">**HannWindow**</a>

  Generates a Hann window as described in the paper https://ieeexplore.ieee.org/document/1455106.

#### Version

This version of the operator has been available since version 17 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>output_datatype</tt> : int (default is 1)</dt>
<dd>The data type of the output tensor. Must be one of the values of the DataType enum in TensorProto that correspond to T2. The default value is 1 = FLOAT.</dd>
<dt><tt>periodic</tt> : int (default is 1)</dt>
<dd>If 1, returns a window to be used as a periodic function. If 0, returns a symmetric window. When 'periodic' is specified, the operator computes a window of length size + 1 and returns the first size points. The default value is 1.</dd>
</dl>
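
The relationship between the two modes can be checked with a short sketch. The helper names `hann_symmetric` and `hann_periodic` are hypothetical, and the sketch assumes a window length greater than 1.

```python
import numpy as np


def hann_symmetric(length):
    # Symmetric Hann window; assumes length > 1.
    n = np.arange(length, dtype=np.float64)
    return 0.5 - 0.5 * np.cos(2 * np.pi * n / (length - 1))


def hann_periodic(size):
    # Build a symmetric window of length size + 1 and keep the first size points.
    return hann_symmetric(size + 1)[:size]


size = 10
# Direct periodic formula, dividing by size instead of size - 1.
direct = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(size) / size)
window = hann_periodic(size)
```

Both routes give the same values, which is why the periodic window can be computed directly by dividing by `size`.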

#### Inputs

<dl>
<dt><tt>size</tt> (non-differentiable) : T1</dt>
<dd>A scalar value indicating the length of the window.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (non-differentiable) : T2</dt>
<dd>A Hann window with length: size. The output has the shape: [size].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain the input size to int32 or int64 tensors.</dd>
<dt><tt>T2</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>hannwindow</summary>

```python
# Test periodic window
node = onnx.helper.make_node(
    "HannWindow",
    inputs=["x"],
    outputs=["y"],
)
size = np.int32(10)
a0 = 0.5
a1 = 0.5
y = a0 - a1 * np.cos(2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / size)
expect(node, inputs=[size], outputs=[y], name="test_hannwindow")

# Test symmetric window
node = onnx.helper.make_node(
    "HannWindow", inputs=["x"], outputs=["y"], periodic=0
)
size = np.int32(10)
a0 = 0.5
a1 = 0.5
y = a0 - a1 * np.cos(
    2 * np.pi * np.arange(0, size, 1, dtype=np.float32) / (size - 1)
)
expect(node, inputs=[size], outputs=[y], name="test_hannwindow_symmetric")
```

</details>


### <a name="HardSigmoid"></a><a name="hardsigmoid">**HardSigmoid**</a>

  HardSigmoid takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the HardSigmoid function, y = max(0, min(1, alpha * x + beta)),
  is applied to the tensor elementwise.

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#HardSigmoid-1">1</a>

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 0.2)</dt>
<dd>Value of alpha.</dd>
<dt><tt>beta</tt> : float (default is 0.5)</dt>
<dd>Value of beta.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>hardsigmoid</summary>

```python
node = onnx.helper.make_node(
    "HardSigmoid", inputs=["x"], outputs=["y"], alpha=0.5, beta=0.6
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.clip(x * 0.5 + 0.6, 0, 1)  # expected output [0.1, 0.6, 1.]
expect(node, inputs=[x], outputs=[y], name="test_hardsigmoid_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x * 0.5 + 0.6, 0, 1)
expect(node, inputs=[x], outputs=[y], name="test_hardsigmoid")
```

</details>


<details>
<summary>hardsigmoid_default</summary>

```python
default_alpha = 0.2
default_beta = 0.5
node = onnx.helper.make_node(
    "HardSigmoid",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x * default_alpha + default_beta, 0, 1)
expect(node, inputs=[x], outputs=[y], name="test_hardsigmoid_default")
```

</details>


### <a name="HardSwish"></a><a name="hardswish">**HardSwish**</a>

  HardSwish takes one input data (Tensor<T>) and produces one output data (Tensor<T>) where
  the HardSwish function, y = x * max(0, min(1, alpha * x + beta)) = x * HardSigmoid<alpha, beta>(x),
  where alpha = 1/6 and beta = 0.5, is applied to the tensor elementwise.
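
A minimal NumPy sketch of the formula above. The helper name `hardswish_sketch` is hypothetical and is not necessarily the `hardswish` helper used by the test suite.

```python
import numpy as np


def hardswish_sketch(x):
    # Fixed coefficients from the operator description.
    alpha = 1.0 / 6.0
    beta = 0.5
    # y = x * HardSigmoid(x) with the coefficients above.
    return x * np.maximum(0, np.minimum(1, alpha * x + beta))


x = np.array([-4.0, -3.0, 0.0, 3.0, 4.0], dtype=np.float32)
y = hardswish_sketch(x)
```

Inputs at or below -3 map to 0, and inputs at or above 3 pass through unchanged.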

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>hardswish</summary>

```python
node = onnx.helper.make_node(
    "HardSwish",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = hardswish(x)

expect(node, inputs=[x], outputs=[y], name="test_hardswish")
```

</details>


### <a name="Hardmax"></a><a name="hardmax">**Hardmax**</a>

  The operator computes the hardmax values for the given input:

   Hardmax(element in input, axis) = 1 if the element is the first maximum value along the specified axis, 0 otherwise

  The "axis" attribute indicates the dimension along which Hardmax
  will be performed. The output tensor has the same shape
  and contains the Hardmax values of the corresponding input.
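
The definition above can be sketched in NumPy as follows. The helper name `hardmax_sketch` is hypothetical; it relies on `np.argmax` returning the index of the first maximum along the given axis.

```python
import numpy as np


def hardmax_sketch(x, axis=-1):
    # np.argmax returns the index of the first maximum along `axis`.
    first_max = np.argmax(x, axis=axis)
    y = np.zeros_like(x)
    # Write 1 at the position of the first maximum; all other entries stay 0.
    np.put_along_axis(y, np.expand_dims(first_max, axis=axis), 1, axis=axis)
    return y


x = np.array([[3, 3, 3, 1], [0, 1, 3, 2]], dtype=np.float32)
y = hardmax_sketch(x)
# y is [[1, 0, 0, 0], [0, 0, 1, 0]]
y0 = hardmax_sketch(x, axis=0)
# y0 is [[1, 1, 1, 0], [0, 0, 0, 1]]
```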

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Hardmax-1">1</a>, <a href="Changelog.md#Hardmax-11">11</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>
Describes the dimension Hardmax will be performed on.
Negative value means counting dimensions
from the back. Accepted range is [-r, r-1] where r = rank(input).
</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>The input tensor of rank >= axis.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The output values with the same shape as the input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>hardmax</summary>

```python
node = onnx.helper.make_node(
    "Hardmax",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([[3, 0, 1, 2], [2, 5, 1, 0], [0, 1, 3, 2], [0, 1, 2, 3]]).astype(
    np.float32
)
# expected result:
# [[1. 0. 0. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 0. 0. 1.]]
y = hardmax(x)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_example")

# For multiple occurrences of the maximal value, the first occurrence is selected for the one-hot output
x = np.array([[3, 3, 3, 1]]).astype(np.float32)
# expected result:
# [[1, 0, 0, 0]]
y = hardmax(x)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_one_hot")
```

</details>


<details>
<summary>hardmax_axis</summary>

```python
x = np.random.randn(3, 4, 5).astype(np.float32)
node = onnx.helper.make_node(
    "Hardmax",
    inputs=["x"],
    outputs=["y"],
    axis=0,
)
y = hardmax(x, axis=0)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_axis_0")

node = onnx.helper.make_node(
    "Hardmax",
    inputs=["x"],
    outputs=["y"],
    axis=1,
)
y = hardmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_axis_1")

node = onnx.helper.make_node(
    "Hardmax",
    inputs=["x"],
    outputs=["y"],
    axis=2,
)
y = hardmax(x, axis=2)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_axis_2")

node = onnx.helper.make_node(
    "Hardmax",
    inputs=["x"],
    outputs=["y"],
    axis=-1,
)
y = hardmax(x, axis=-1)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_negative_axis")

# default axis is -1
node = onnx.helper.make_node(
    "Hardmax",
    inputs=["x"],
    outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_hardmax_default_axis")
```

</details>


### <a name="Identity"></a><a name="identity">**Identity**</a>

  Identity operator. Copies the input to the output unchanged.

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Identity-1">1</a>, <a href="Changelog.md#Identity-13">13</a>, <a href="Changelog.md#Identity-14">14</a>, <a href="Changelog.md#Identity-16">16</a>, <a href="Changelog.md#Identity-19">19</a>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : V</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : V</dt>
<dd>Tensor to copy input into.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128)), optional(seq(tensor(uint8))), optional(seq(tensor(uint16))), optional(seq(tensor(uint32))), optional(seq(tensor(uint64))), optional(seq(tensor(int8))), optional(seq(tensor(int16))), optional(seq(tensor(int32))), optional(seq(tensor(int64))), optional(seq(tensor(float16))), optional(seq(tensor(float))), optional(seq(tensor(double))), optional(seq(tensor(string))), optional(seq(tensor(bool))), optional(seq(tensor(complex64))), optional(seq(tensor(complex128))), optional(tensor(uint8)), optional(tensor(uint16)), optional(tensor(uint32)), optional(tensor(uint64)), optional(tensor(int8)), optional(tensor(int16)), optional(tensor(int32)), optional(tensor(int64)), optional(tensor(float16)), optional(tensor(float)), optional(tensor(double)), optional(tensor(string)), optional(tensor(bool)), optional(tensor(complex64)), optional(tensor(complex128))</dt>
<dd>Constrain input and output types to all tensor, sequence, and optional types.</dd>
</dl>


#### Examples

<details>
<summary>identity</summary>

```python
node = onnx.helper.make_node(
    "Identity",
    inputs=["x"],
    outputs=["y"],
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(node, inputs=[data], outputs=[data], name="test_identity")
```

</details>


<details>
<summary>identity_opt</summary>

```python
ten_in_tp = onnx.helper.make_tensor_type_proto(
    onnx.TensorProto.FLOAT, shape=[5]
)
seq_in_tp = onnx.helper.make_sequence_type_proto(ten_in_tp)
opt_in_tp = onnx.helper.make_optional_type_proto(seq_in_tp)

identity_node = onnx.helper.make_node(
    "Identity", inputs=["opt_in"], outputs=["opt_out"]
)

x = [np.array([1, 2, 3, 4, 5]).astype(np.float32)]

expect(
    identity_node,
    inputs=[x],
    outputs=[x],
    name="test_identity_opt",
    opset_imports=[onnx.helper.make_opsetid("", 16)],
    input_type_protos=[opt_in_tp],
    output_type_protos=[opt_in_tp],
)
```

</details>


<details>
<summary>sequence</summary>

```python
node = onnx.helper.make_node(
    "Identity",
    inputs=["x"],
    outputs=["y"],
)

data = [
    np.array(
        [
            [
                [
                    [1, 2],
                    [3, 4],
                ]
            ]
        ],
        dtype=np.float32,
    ),
    np.array(
        [
            [
                [
                    [2, 3],
                    [1, 5],
                ]
            ]
        ],
        dtype=np.float32,
    ),
]

expect(node, inputs=[data], outputs=[data], name="test_identity_sequence")
```

</details>


### <a name="If"></a><a name="if">**If**</a>

  If conditional. Runs `then_branch` when `cond` is true and `else_branch` otherwise.

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#If-1">1</a>, <a href="Changelog.md#If-11">11</a>, <a href="Changelog.md#If-13">13</a>, <a href="Changelog.md#If-16">16</a>, <a href="Changelog.md#If-19">19</a>

#### Attributes

<dl>
<dt><tt>else_branch</tt> : graph (required)</dt>
<dd>Graph to run if condition is false. Has N outputs: values you wish to be live-out to the enclosing scope. The number of outputs must match the number of outputs in the then_branch.</dd>
<dt><tt>then_branch</tt> : graph (required)</dt>
<dd>Graph to run if condition is true. Has N outputs: values you wish to be live-out to the enclosing scope. The number of outputs must match the number of outputs in the else_branch.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>cond</tt> : B</dt>
<dd>Condition for the if. The tensor must contain a single element.</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>outputs</tt> (variadic, heterogeneous) : V</dt>
<dd>Values that are live-out to the enclosing scope. The return values in the `then_branch` and `else_branch` must be of the same data type. The `then_branch` and `else_branch` may produce tensors with the same element type and different shapes. If corresponding outputs from the then-branch and the else-branch have static shapes S1 and S2, then the shape of the corresponding output variable of the if-node (if present) must be compatible with both S1 and S2 as it represents the union of both possible shapes. For example, if in a model file, the first output of `then_branch` is typed float tensor with shape [2] and the first output of `else_branch` is another float tensor with shape [3], If's first output should have (a) no shape set, or (b) a shape of rank 1 with neither `dim_value` nor `dim_param` set, or (c) a shape of rank 1 with a unique `dim_param`. In contrast, the first output cannot have the shape [2] since [2] and [3] are not compatible.</dd>
</dl>
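
The shape-compatibility rule for the outputs can be illustrated with a small sketch. This is not part of the ONNX API: `merge_branch_shapes` is a hypothetical helper, shapes are modeled as tuples of ints with `None` standing for a dimension that has neither `dim_value` nor `dim_param` set, and returning `None` for mismatched ranks (no shape set) is an assumption.

```python
def merge_branch_shapes(s1, s2):
    # Hypothetical helper: shapes are tuples of ints; None marks an unknown dim.
    if s1 is None or s2 is None or len(s1) != len(s2):
        return None  # assumption: leave the output shape unset
    # Keep a dimension only when both branches agree on it.
    return tuple(d1 if d1 == d2 else None for d1, d2 in zip(s1, s2))


print(merge_branch_shapes((2,), (3,)))  # (None,)
print(merge_branch_shapes((5,), (5,)))  # (5,)
```

The first call corresponds to option (b) in the description: a rank-1 shape whose single dimension is left unknown.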

#### Type Constraints

<dl>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(bfloat16)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128)), seq(tensor(float8e4m3fn)), seq(tensor(float8e4m3fnuz)), seq(tensor(float8e5m2)), seq(tensor(float8e5m2fnuz)), seq(tensor(uint4)), seq(tensor(int4)), optional(seq(tensor(uint8))), optional(seq(tensor(uint16))), optional(seq(tensor(uint32))), optional(seq(tensor(uint64))), optional(seq(tensor(int8))), optional(seq(tensor(int16))), optional(seq(tensor(int32))), optional(seq(tensor(int64))), optional(seq(tensor(bfloat16))), optional(seq(tensor(float16))), optional(seq(tensor(float))), optional(seq(tensor(double))), optional(seq(tensor(string))), optional(seq(tensor(bool))), optional(seq(tensor(complex64))), optional(seq(tensor(complex128))), optional(tensor(uint8)), optional(tensor(uint16)), optional(tensor(uint32)), optional(tensor(uint64)), optional(tensor(int8)), optional(tensor(int16)), optional(tensor(int32)), optional(tensor(int64)), optional(tensor(bfloat16)), optional(tensor(float16)), optional(tensor(float)), optional(tensor(double)), optional(tensor(string)), optional(tensor(bool)), optional(tensor(complex64)), optional(tensor(complex128)), optional(tensor(float8e4m3fn)), optional(tensor(float8e4m3fnuz)), optional(tensor(float8e5m2)), optional(tensor(float8e5m2fnuz)), optional(tensor(uint4)), optional(tensor(int4))</dt>
<dd>All Tensor, Sequence(Tensor), Optional(Tensor), and Optional(Sequence(Tensor)) types up to IRv10.</dd>
<dt><tt>B</tt> : tensor(bool)</dt>
<dd>Only bool</dd>
</dl>


#### Examples

<details>
<summary>if</summary>

```python
# Given a bool scalar input cond,
# return constant tensor x if cond is True; otherwise return constant tensor y.

then_out = onnx.helper.make_tensor_value_info(
    "then_out", onnx.TensorProto.FLOAT, [5]
)
else_out = onnx.helper.make_tensor_value_info(
    "else_out", onnx.TensorProto.FLOAT, [5]
)

x = np.array([1, 2, 3, 4, 5]).astype(np.float32)
y = np.array([5, 4, 3, 2, 1]).astype(np.float32)

then_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["then_out"],
    value=onnx.numpy_helper.from_array(x),
)

else_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["else_out"],
    value=onnx.numpy_helper.from_array(y),
)

then_body = onnx.helper.make_graph(
    [then_const_node], "then_body", [], [then_out]
)

else_body = onnx.helper.make_graph(
    [else_const_node], "else_body", [], [else_out]
)

if_node = onnx.helper.make_node(
    "If",
    inputs=["cond"],
    outputs=["res"],
    then_branch=then_body,
    else_branch=else_body,
)

cond = np.array(1).astype(bool)
res = x if cond else y
expect(
    if_node,
    inputs=[cond],
    outputs=[res],
    name="test_if",
    opset_imports=[onnx.helper.make_opsetid("", 11)],
)
```

</details>


<details>
<summary>if_optional</summary>

```python
# Given a bool scalar input cond, return an empty optional sequence of
# tensor if True, return an optional sequence with value x
# (the input optional sequence) otherwise.

ten_in_tp = onnx.helper.make_tensor_type_proto(
    onnx.TensorProto.FLOAT, shape=[5]
)
seq_in_tp = onnx.helper.make_sequence_type_proto(ten_in_tp)

then_out_tensor_tp = onnx.helper.make_tensor_type_proto(
    onnx.TensorProto.FLOAT, shape=[5]
)
then_out_seq_tp = onnx.helper.make_sequence_type_proto(then_out_tensor_tp)
then_out_opt_tp = onnx.helper.make_optional_type_proto(then_out_seq_tp)
then_out = onnx.helper.make_value_info("optional_empty", then_out_opt_tp)

else_out_tensor_tp = onnx.helper.make_tensor_type_proto(
    onnx.TensorProto.FLOAT, shape=[5]
)
else_out_seq_tp = onnx.helper.make_sequence_type_proto(else_out_tensor_tp)
else_out_opt_tp = onnx.helper.make_optional_type_proto(else_out_seq_tp)
else_out = onnx.helper.make_value_info("else_opt", else_out_opt_tp)

x = [np.array([1, 2, 3, 4, 5]).astype(np.float32)]
cond = np.array(0).astype(bool)
res = compute_if_outputs(x, cond)

opt_empty_in = onnx.helper.make_node(
    "Optional", inputs=[], outputs=["optional_empty"], type=seq_in_tp
)

then_body = onnx.helper.make_graph([opt_empty_in], "then_body", [], [then_out])

else_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["x"],
    value=onnx.numpy_helper.from_array(x[0]),
)

else_seq_node = onnx.helper.make_node(
    "SequenceConstruct", inputs=["x"], outputs=["else_seq"]
)

else_optional_seq_node = onnx.helper.make_node(
    "Optional", inputs=["else_seq"], outputs=["else_opt"]
)

else_body = onnx.helper.make_graph(
    [else_const_node, else_seq_node, else_optional_seq_node],
    "else_body",
    [],
    [else_out],
)

if_node = onnx.helper.make_node(
    "If",
    inputs=["cond"],
    outputs=["sequence"],
    then_branch=then_body,
    else_branch=else_body,
)

expect(
    if_node,
    inputs=[cond],
    outputs=[res],
    name="test_if_opt",
    output_type_protos=[else_out_opt_tp],
    opset_imports=[onnx.helper.make_opsetid("", 16)],
)
```

</details>


<details>
<summary>if_seq</summary>

```python
# Given a bool scalar input cond,
# return constant sequence x if cond is True; otherwise return constant sequence y.

then_out = onnx.helper.make_tensor_sequence_value_info(
    "then_out", onnx.TensorProto.FLOAT, shape=[5]
)
else_out = onnx.helper.make_tensor_sequence_value_info(
    "else_out", onnx.TensorProto.FLOAT, shape=[5]
)

x = [np.array([1, 2, 3, 4, 5]).astype(np.float32)]
y = [np.array([5, 4, 3, 2, 1]).astype(np.float32)]

then_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["x"],
    value=onnx.numpy_helper.from_array(x[0]),
)

then_seq_node = onnx.helper.make_node(
    "SequenceConstruct", inputs=["x"], outputs=["then_out"]
)

else_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["y"],
    value=onnx.numpy_helper.from_array(y[0]),
)

else_seq_node = onnx.helper.make_node(
    "SequenceConstruct", inputs=["y"], outputs=["else_out"]
)

then_body = onnx.helper.make_graph(
    [then_const_node, then_seq_node], "then_body", [], [then_out]
)

else_body = onnx.helper.make_graph(
    [else_const_node, else_seq_node], "else_body", [], [else_out]
)

if_node = onnx.helper.make_node(
    "If",
    inputs=["cond"],
    outputs=["res"],
    then_branch=then_body,
    else_branch=else_body,
)

cond = np.array(1).astype(bool)
res = x if cond else y
expect(
    if_node,
    inputs=[cond],
    outputs=[res],
    name="test_if_seq",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)
```

</details>


### <a name="ImageDecoder"></a><a name="imagedecoder">**ImageDecoder**</a>

  Loads and decodes an image from a file. If it can't decode for any reason (e.g. corrupted encoded
  stream, invalid format), it will return an empty matrix.
  The following image formats are supported:
  * BMP
  * JPEG (note: Lossless JPEG support is optional)
  * JPEG2000
  * TIFF
  * PNG
  * WebP
  * Portable image format (PBM, PGM, PPM, PXM, PNM)

  Decoded images follow a channel-last layout: (Height, Width, Channels).

  **JPEG chroma upsampling method:**
  When upsampling the chroma components by a factor of 2, the pixels are linearly interpolated so that the
  centers of the output pixels are 1/4 and 3/4 of the way between input pixel centers.
  When rounding, 0.5 is rounded down and up at alternate pixel locations to prevent bias towards
  larger values (ordered dither pattern).
  Considering adjacent input pixels A, B, and C, B is upsampled to pixels B0 and B1 so that
  ```
  B0 = round_half_down((1/4) * A + (3/4) * B)
  B1 = round_half_up((3/4) * B + (1/4) * C)
  ```
  This method is the default chroma upsampling method in the well-established libjpeg-turbo library,
  also referred to as "smooth" or "fancy" upsampling.
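
The interpolation and alternating rounding described above can be sketched for a single row of chroma samples. The helper name `upsample_chroma_row_sketch` is hypothetical, and replicating the edge samples to supply the missing neighbor at each end is an assumption that the description does not state.

```python
import numpy as np


def upsample_chroma_row_sketch(row):
    # Hypothetical helper: doubles the width of one row of chroma samples.
    # Assumption: edge samples are replicated to supply the missing neighbor.
    src = np.asarray(row, dtype=np.int32)
    padded = np.pad(src, 1, mode="edge")
    left, right = padded[:-2], padded[2:]
    out = np.empty(2 * src.size, dtype=np.int32)
    # B0 = round_half_down((A + 3B) / 4): adding 1 before the shift rounds .5 down.
    out[0::2] = (left + 3 * src + 1) >> 2
    # B1 = round_half_up((3B + C) / 4): adding 2 before the shift rounds .5 up.
    out[1::2] = (3 * src + right + 2) >> 2
    return out.astype(np.uint8)


upsampled = upsample_chroma_row_sketch([10, 20, 40])
print(upsampled)  # [10 13 17 25 35 40]
```

For the middle sample 20, the left output is 17.5 rounded down to 17 and the right output is exactly 25, which shows the two rounding directions in one pair.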

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>pixel_format</tt> : string (default is RGB)</dt>
<dd>Pixel format. Can be one of "RGB", "BGR", or "Grayscale".</dd>
</dl>

#### Inputs

<dl>
<dt><tt>encoded_stream</tt> (non-differentiable) : T1</dt>
<dd>Encoded stream</dd>
</dl>

#### Outputs

<dl>
<dt><tt>image</tt> (non-differentiable) : T2</dt>
<dd>Decoded image</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8)</dt>
<dd>Constrain input types to 8-bit unsigned integer tensor.</dd>
<dt><tt>T2</tt> : tensor(uint8)</dt>
<dd>Constrain output types to 8-bit unsigned integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>image_decoder_decode_bmp_rgb</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="RGB",
)

data, output = _generate_test_data(
    "bmp", _image_decoder_data.image_decoder_decode_bmp_rgb, "RGB"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_bmp_rgb",
)
```

</details>


<details>
<summary>image_decoder_decode_jpeg2k_rgb</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="RGB",
)

data, output = _generate_test_data(
    "jpeg2000", _image_decoder_data.image_decoder_decode_jpeg2k_rgb, "RGB"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_jpeg2k_rgb",
)
```

</details>


<details>
<summary>image_decoder_decode_jpeg_bgr</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="BGR",
)

data, output = _generate_test_data(
    "jpeg", _image_decoder_data.image_decoder_decode_jpeg_bgr, "BGR"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_jpeg_bgr",
)
```

</details>


<details>
<summary>image_decoder_decode_jpeg_grayscale</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="Grayscale",
)

data, output = _generate_test_data(
    "jpeg", _image_decoder_data.image_decoder_decode_jpeg_grayscale, "Grayscale"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_jpeg_grayscale",
)
```

</details>


<details>
<summary>image_decoder_decode_jpeg_rgb</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="RGB",
)

data, output = _generate_test_data(
    "jpeg", _image_decoder_data.image_decoder_decode_jpeg_rgb, "RGB"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_jpeg_rgb",
)
```

</details>


<details>
<summary>image_decoder_decode_png_rgb</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="RGB",
)

data, output = _generate_test_data(
    "png", _image_decoder_data.image_decoder_decode_png_rgb, "RGB"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_png_rgb",
)
```

</details>


<details>
<summary>image_decoder_decode_pnm_rgb</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="RGB",
)

data, output = _generate_test_data(
    "ppm", _image_decoder_data.image_decoder_decode_pnm_rgb, "RGB"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_pnm_rgb",
)
```

</details>


<details>
<summary>image_decoder_decode_tiff_rgb</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="RGB",
)

data, output = _generate_test_data(
    "tiff", _image_decoder_data.image_decoder_decode_tiff_rgb, "RGB"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_tiff_rgb",
)
```

</details>


<details>
<summary>image_decoder_decode_webp_rgb</summary>

```python
node = onnx.helper.make_node(
    "ImageDecoder",
    inputs=["data"],
    outputs=["output"],
    pixel_format="RGB",
)

data, output = _generate_test_data(
    "webp", _image_decoder_data.image_decoder_decode_webp_rgb, "RGB"
)
expect(
    node,
    inputs=[data],
    outputs=[output],
    name="test_image_decoder_decode_webp_rgb",
)
```

</details>


### <a name="InstanceNormalization"></a><a name="instancenormalization">**InstanceNormalization**</a>

  Carries out instance normalization as described in the paper
  https://arxiv.org/abs/1607.08022.

  y = scale * (x - mean) / sqrt(variance + epsilon) + B,
  where mean and variance are computed per instance per channel.


#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#InstanceNormalization-1">1</a>

#### Attributes

<dl>
<dt><tt>epsilon</tt> : float (default is 1e-05)</dt>
<dd>The epsilon value to use to avoid division by zero.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for the image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn).</dd>
<dt><tt>scale</tt> (differentiable) : T</dt>
<dd>The input 1-dimensional scale tensor of size C.</dd>
<dt><tt>B</tt> (differentiable) : T</dt>
<dd>The input 1-dimensional bias tensor of size C.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The output tensor of the same shape as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>instancenormalization</summary>

```python
def _instancenorm_test_mode(x, s, bias, epsilon=1e-5):  # type: ignore
    dims_x = len(x.shape)
    axis = tuple(range(2, dims_x))
    mean = np.mean(x, axis=axis, keepdims=True)
    var = np.var(x, axis=axis, keepdims=True)
    dim_ones = (1,) * (dims_x - 2)
    s = s.reshape(-1, *dim_ones)
    bias = bias.reshape(-1, *dim_ones)
    return s * (x - mean) / np.sqrt(var + epsilon) + bias

# input size: (1, 2, 1, 3)
x = np.array([[[[-1, 0, 1]], [[2, 3, 4]]]]).astype(np.float32)
s = np.array([1.0, 1.5]).astype(np.float32)
bias = np.array([0, 1]).astype(np.float32)
y = _instancenorm_test_mode(x, s, bias).astype(np.float32)

node = onnx.helper.make_node(
    "InstanceNormalization",
    inputs=["x", "s", "bias"],
    outputs=["y"],
)

# output size: (1, 2, 1, 3)
expect(node, inputs=[x, s, bias], outputs=[y], name="test_instancenorm_example")

# input size: (2, 3, 4, 5)
x = np.random.randn(2, 3, 4, 5).astype(np.float32)
s = np.random.randn(3).astype(np.float32)
bias = np.random.randn(3).astype(np.float32)
epsilon = 1e-2
y = _instancenorm_test_mode(x, s, bias, epsilon).astype(np.float32)

node = onnx.helper.make_node(
    "InstanceNormalization",
    inputs=["x", "s", "bias"],
    outputs=["y"],
    epsilon=epsilon,
)

# output size: (2, 3, 4, 5)
expect(node, inputs=[x, s, bias], outputs=[y], name="test_instancenorm_epsilon")
```

</details>


### <a name="IsInf"></a><a name="isinf">**IsInf**</a>

  Map infinity to true and other values to false.

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#IsInf-10">10</a>

#### Attributes

<dl>
<dt><tt>detect_negative</tt> : int (default is 1)</dt>
<dd>(Optional) Whether to map negative infinity to true. Defaults to 1, so that negative infinity maps to true. Set this attribute to 0 if negative infinity should be mapped to false.</dd>
<dt><tt>detect_positive</tt> : int (default is 1)</dt>
<dd>(Optional) Whether to map positive infinity to true. Defaults to 1, so that positive infinity maps to true. Set this attribute to 0 if positive infinity should be mapped to false.</dd>
</dl>
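
As a rough illustration of how the two attributes combine, the following NumPy sketch treats each flag as an independent switch. The function `is_inf` is a hypothetical reference helper written for this note, not the ONNX reference implementation. Under this reading, setting both flags to 0 maps every element to false.

```python
import numpy as np

def is_inf(x, detect_negative=1, detect_positive=1):
    # Hypothetical helper: each flag independently enables one sign of infinity.
    result = np.zeros(x.shape, dtype=bool)
    if detect_positive:
        result |= np.isposinf(x)
    if detect_negative:
        result |= np.isneginf(x)
    return result

x = np.array([-1.2, np.nan, np.inf, -np.inf], dtype=np.float32)
both = is_inf(x)
negative_only = is_inf(x, detect_positive=0)
neither = is_inf(x, detect_negative=0, detect_positive=0)
```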

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T1</dt>
<dd>input</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T2</dt>
<dd>output</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz)</dt>
<dd>Constrain input types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(bool)</dt>
<dd>Constrain output types to boolean tensors.</dd>
</dl>


#### Examples

<details>
<summary>infinity</summary>

```python
node = onnx.helper.make_node(
    "IsInf",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float32)
y = np.isinf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf")
```

</details>


<details>
<summary>infinity_float16</summary>

```python
node = onnx.helper.make_node(
    "IsInf",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float16)
y = np.isinf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf_float16")
```

</details>


<details>
<summary>negative_infinity_only</summary>

```python
node = onnx.helper.make_node(
    "IsInf", inputs=["x"], outputs=["y"], detect_positive=0
)

x = np.array([-1.7, np.nan, np.inf, -3.6, -np.inf, np.inf], dtype=np.float32)
y = np.isneginf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf_negative")
```

</details>


<details>
<summary>positive_infinity_only</summary>

```python
node = onnx.helper.make_node(
    "IsInf", inputs=["x"], outputs=["y"], detect_negative=0
)

x = np.array([-1.7, np.nan, np.inf, 3.6, -np.inf, np.inf], dtype=np.float32)
y = np.isposinf(x)
expect(node, inputs=[x], outputs=[y], name="test_isinf_positive")
```

</details>


### <a name="IsNaN"></a><a name="isnan">**IsNaN**</a>

  Returns which elements of the input are NaN.

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#IsNaN-9">9</a>, <a href="Changelog.md#IsNaN-13">13</a>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T1</dt>
<dd>input</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T2</dt>
<dd>output</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz)</dt>
<dd>Constrain input types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(bool)</dt>
<dd>Constrain output types to boolean tensors.</dd>
</dl>


#### Examples

<details>
<summary>float16</summary>

```python
node = onnx.helper.make_node(
    "IsNaN",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float16)
y = np.isnan(x)
expect(node, inputs=[x], outputs=[y], name="test_isnan_float16")
```

</details>


<details>
<summary>isnan</summary>

```python
node = onnx.helper.make_node(
    "IsNaN",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1.2, np.nan, np.inf, 2.8, -np.inf, np.inf], dtype=np.float32)
y = np.isnan(x)
expect(node, inputs=[x], outputs=[y], name="test_isnan")
```

</details>


### <a name="LRN"></a><a name="lrn">**LRN**</a>

  Local Response Normalization proposed in the [AlexNet paper](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf).
  It normalizes over local input regions.
  The local region is defined across the channels. For an element `X[n, c, d1, ..., dk]` in a tensor
  of shape `(N x C x D1 x D2 x ... x Dk)`, its region is
  `{X[n, i, d1, ..., dk] | max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2))}`.

  `square_sum[n, c, d1, ..., dk] = sum(X[n, i, d1, ..., dk] ^ 2)`,
  where `max(0, c - floor((size - 1) / 2)) <= i <= min(C - 1, c + ceil((size - 1) / 2))`.

  `Y[n, c, d1, ..., dk] = X[n, c, d1, ..., dk] / (bias + alpha / size * square_sum[n, c, d1, ..., dk] ) ^ beta`
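
To make the channel window concrete, here is a small sketch that lists the channel indices `i` entering `square_sum` for each channel `c`. The helper name `lrn_window` is an assumption made for illustration. For an odd `size` the window is symmetric around `c`. For an even `size` the `floor`/`ceil` split places the extra channel on the upper side.

```python
import math

def lrn_window(c, size, num_channels):
    # Hypothetical helper: inclusive channel range taken from the formula above.
    lo = max(0, c - math.floor((size - 1) / 2))
    hi = min(num_channels - 1, c + math.ceil((size - 1) / 2))
    return list(range(lo, hi + 1))

# C = 5 channels
windows_odd = [lrn_window(c, 3, 5) for c in range(5)]
windows_even = [lrn_window(c, 4, 5) for c in range(5)]

print(windows_odd)   # [[0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4]]
print(windows_even)  # [[0, 1, 2], [0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4], [3, 4]]
```

Windows are clipped at the first and last channel, so edge channels sum over fewer elements while still dividing `alpha` by the full `size`.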

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#LRN-1">1</a>

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 0.0001)</dt>
<dd>Scaling parameter.</dd>
<dt><tt>beta</tt> : float (default is 0.75)</dt>
<dd>The exponent.</dd>
<dt><tt>bias</tt> : float (default is 1.0)</dt>
<dd>Additive constant in the denominator (the <tt>bias</tt> term in the formula above).</dd>
<dt><tt>size</tt> : int (required)</dt>
<dd>The number of channels to sum over</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor, which has the same shape and type as the input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>default</summary>

```python
alpha = 0.0001
beta = 0.75
bias = 1.0
nsize = 3
node = onnx.helper.make_node("LRN", inputs=["x"], outputs=["y"], size=3)
x = np.random.randn(5, 5, 5, 5).astype(np.float32)
square_sum = np.zeros((5, 5, 5, 5)).astype(np.float32)
for n, c, h, w in np.ndindex(x.shape):
    square_sum[n, c, h, w] = sum(
        x[
            n,
            max(0, c - int(math.floor((nsize - 1) / 2))) : min(
                5, c + int(math.ceil((nsize - 1) / 2)) + 1
            ),
            h,
            w,
        ]
        ** 2
    )
y = x / ((bias + (alpha / nsize) * square_sum) ** beta)
expect(node, inputs=[x], outputs=[y], name="test_lrn_default")
```

</details>


<details>
<summary>lrn</summary>

```python
alpha = 0.0002
beta = 0.5
bias = 2.0
nsize = 3
node = onnx.helper.make_node(
    "LRN",
    inputs=["x"],
    outputs=["y"],
    alpha=alpha,
    beta=beta,
    bias=bias,
    size=nsize,
)
x = np.random.randn(5, 5, 5, 5).astype(np.float32)
square_sum = np.zeros((5, 5, 5, 5)).astype(np.float32)
for n, c, h, w in np.ndindex(x.shape):
    square_sum[n, c, h, w] = sum(
        x[
            n,
            max(0, c - int(math.floor((nsize - 1) / 2))) : min(
                5, c + int(math.ceil((nsize - 1) / 2)) + 1
            ),
            h,
            w,
        ]
        ** 2
    )
y = x / ((bias + (alpha / nsize) * square_sum) ** beta)
expect(node, inputs=[x], outputs=[y], name="test_lrn")
```

</details>


### <a name="LSTM"></a><a name="lstm">**LSTM**</a>

  Computes a one-layer LSTM. This operator is usually supported via some
  custom implementation such as CuDNN.

  Notations:

  * `X` - input tensor
  * `i` - input gate
  * `o` - output gate
  * `f` - forget gate
  * `c` - cell gate
  * `t` - time step (t-1 means previous time step)
  * `W[iofc]` - W parameter weight matrix for input, output, forget, and cell gates
  * `R[iofc]` - R recurrence weight matrix for input, output, forget, and cell gates
  * `Wb[iofc]` - W bias vectors for input, output, forget, and cell gates
  * `Rb[iofc]` - R bias vectors for input, output, forget, and cell gates
  * `P[iof]`  - P peephole weight vector for input, output, and forget gates
  * `WB[iofc]` - W parameter weight matrix for backward input, output, forget, and cell gates
  * `RB[iofc]` - R recurrence weight matrix for backward input, output, forget, and cell gates
  * `WBb[iofc]` - W bias vectors for backward input, output, forget, and cell gates
  * `RBb[iofc]` - R bias vectors for backward input, output, forget, and cell gates
  * `PB[iof]`  - P peephole weight vector for backward input, output, and forget gates
  * `H` - Hidden state
  * `num_directions` - 2 if direction == bidirectional else 1

  Activation functions:

  * Relu(x)                - max(0, x)
  * Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  * Sigmoid(x)             - 1/(1 + e^{-x})

  NOTE: Below are optional

  * Affine(x)              - alpha*x + beta
  * LeakyRelu(x)           - x if x >= 0 else alpha * x
  * ThresholdedRelu(x)     - x if x >= alpha else 0
  * ScaledTanh(x)          - alpha*Tanh(beta*x)
  * HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  * Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  * Softsign(x)            - x/(1 + |x|)
  * Softplus(x)            - log(1 + e^x)

  Equations (Default: f=Sigmoid, g=Tanh, h=Tanh):

  * it = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Pi (.) Ct-1 + Wbi + Rbi)
  * ft = f(Xt*(Wf^T) + Ht-1*(Rf^T) + Pf (.) Ct-1 + Wbf + Rbf)
  * ct = g(Xt*(Wc^T) + Ht-1*(Rc^T) + Wbc + Rbc)
  * Ct = ft (.) Ct-1 + it (.) ct
  * ot = f(Xt*(Wo^T) + Ht-1*(Ro^T) + Po (.) Ct + Wbo + Rbo)
  * Ht = ot (.) h(Ct)

  This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
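
The equations above can be sketched as a single time step in NumPy. This is a minimal sketch under stated assumptions, not the reference implementation: one direction only, default activations (f=Sigmoid, g=Tanh, h=Tanh), no `clip`, no `input_forget`, and no `sequence_lens` handling. The names `sigmoid` and `lstm_step` are hypothetical. Gate blocks are split in the `iofc` order used by the notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, R, Wb, Rb, P):
    """One forward time step (hypothetical helper).

    x_t: [batch, input_size]; h_prev, c_prev: [batch, hidden_size]
    W: [4*hidden_size, input_size]; R: [4*hidden_size, hidden_size]
    Wb, Rb: [4*hidden_size]; P: [3*hidden_size]
    """
    # Pre-activations for all four gates, stacked in the order i, o, f, c.
    pre = x_t @ W.T + h_prev @ R.T + Wb + Rb
    pre_i, pre_o, pre_f, pre_c = np.split(pre, 4, axis=-1)
    # Peephole weights are stacked in the order i, o, f.
    p_i, p_o, p_f = np.split(P, 3)

    i_t = sigmoid(pre_i + p_i * c_prev)
    f_t = sigmoid(pre_f + p_f * c_prev)
    c_tilde = np.tanh(pre_c)
    C_t = f_t * c_prev + i_t * c_tilde
    # The output gate uses the new cell state Ct, not Ct-1.
    o_t = sigmoid(pre_o + p_o * C_t)
    H_t = o_t * np.tanh(C_t)
    return H_t, C_t

hidden_size, input_size = 3, 2
x_t = np.array([[1.0, 2.0]])
h_prev = np.zeros((1, hidden_size))
c_prev = np.zeros((1, hidden_size))
W = 0.1 * np.ones((4 * hidden_size, input_size))
R = 0.1 * np.ones((4 * hidden_size, hidden_size))
Wb = np.zeros(4 * hidden_size)
Rb = np.zeros(4 * hidden_size)
P = np.zeros(3 * hidden_size)

H_t, C_t = lstm_step(x_t, h_prev, c_prev, W, R, Wb, Rb, P)
```

With these constant weights and zero initial state, every gate pre-activation equals 0.3, so `C_t` reduces to `sigmoid(0.3) * tanh(0.3)` in every hidden unit.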

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#LSTM-1">1</a>, <a href="Changelog.md#LSTM-7">7</a>

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as of corresponding ONNX operators. For example with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as of corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings</dt>
<dd>A list of 3 (or 6 if bidirectional) activation functions for input, output, forget, cell, and hidden. The activation functions must be one of the activation functions specified above. Optional: See the equations for default if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string (default is forward)</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>input_forget</tt> : int (default is 0)</dt>
<dd>Couple the input and forget gates if 1.</dd>
<dt><tt>layout</tt> : int (default is 0)</dt>
<dd>The shape format of inputs X, initial_h, initial_c and outputs Y, Y_h, Y_c. If 0, the following shapes are expected: X.shape = [seq_length, batch_size, input_size], Y.shape = [seq_length, num_directions, batch_size, hidden_size], initial_h.shape = Y_h.shape = initial_c.shape = Y_c.shape = [num_directions, batch_size, hidden_size]. If 1, the following shapes are expected: X.shape = [batch_size, seq_length, input_size], Y.shape = [batch_size, seq_length, num_directions, hidden_size], initial_h.shape = Y_h.shape = initial_c.shape = Y_c.shape = [batch_size, num_directions, hidden_size].</dd>
</dl>
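
The two `layout` values describe the same data with the axes in a different order. The sketch below converts placeholder tensors from `layout=0` to `layout=1` with NumPy transposes. The concrete sizes are arbitrary assumptions chosen so that each axis is distinguishable.

```python
import numpy as np

seq_length, batch_size, input_size = 5, 2, 3
num_directions, hidden_size = 1, 4

# layout = 0 shapes, as listed in the attribute description
X0 = np.zeros((seq_length, batch_size, input_size), dtype=np.float32)
Y0 = np.zeros((seq_length, num_directions, batch_size, hidden_size), dtype=np.float32)
Y_h0 = np.zeros((num_directions, batch_size, hidden_size), dtype=np.float32)

# layout = 1 moves batch_size to the front
X1 = np.swapaxes(X0, 0, 1)            # [batch_size, seq_length, input_size]
Y1 = np.transpose(Y0, (2, 0, 1, 3))   # [batch_size, seq_length, num_directions, hidden_size]
Y_h1 = np.swapaxes(Y_h0, 0, 1)        # [batch_size, num_directions, hidden_size]

print(X1.shape, Y1.shape, Y_h1.shape)
```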

#### Inputs (3 - 8)

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> (differentiable) : T</dt>
<dd>The weight tensor for the gates. Concatenation of `W[iofc]` and `WB[iofc]` (if bidirectional) along dimension 0. The tensor has shape `[num_directions, 4*hidden_size, input_size]`.</dd>
<dt><tt>R</tt> (differentiable) : T</dt>
<dd>The recurrence weight tensor. Concatenation of `R[iofc]` and `RB[iofc]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 4*hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional, differentiable) : T</dt>
<dd>The bias tensor for the gates. Concatenation of `[Wb[iofc], Rb[iofc]]`, and `[WBb[iofc], RBb[iofc]]` (if bidirectional) along dimension 0. This tensor has shape `[num_directions, 8*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
<dt><tt>sequence_lens</tt> (optional, non-differentiable) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified - assumed all sequences in the batch to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional, non-differentiable) : T</dt>
<dd>Optional initial value of the hidden. If not specified - assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>initial_c</tt> (optional, non-differentiable) : T</dt>
<dd>Optional initial value of the cell. If not specified - assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>P</tt> (optional, differentiable) : T</dt>
<dd>The weight tensor for peepholes. Concatenation of `P[iof]` and `PB[iof]` (if bidirectional) along dimension 0. It has shape `[num_directions, 3*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
</dl>
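
For the bidirectional case, the packed inputs can be assembled roughly as follows. This is a sketch with made-up constant values, and the per-direction variable names (`W_forward`, `Wb_backward`, and so on) are hypothetical. Stacking the per-direction arrays along a new leading axis is equivalent to concatenating `[1, ...]`-shaped tensors along dimension 0.

```python
import numpy as np

hidden_size, input_size = 3, 2

# Per-direction parameters, each already stacked in iofc (or iof) order.
W_forward = np.full((4 * hidden_size, input_size), 0.1, dtype=np.float32)
W_backward = np.full((4 * hidden_size, input_size), 0.2, dtype=np.float32)
R_forward = np.full((4 * hidden_size, hidden_size), 0.1, dtype=np.float32)
R_backward = np.full((4 * hidden_size, hidden_size), 0.2, dtype=np.float32)
Wb_forward = np.zeros(4 * hidden_size, dtype=np.float32)
Rb_forward = np.zeros(4 * hidden_size, dtype=np.float32)
Wb_backward = np.ones(4 * hidden_size, dtype=np.float32)
Rb_backward = np.ones(4 * hidden_size, dtype=np.float32)
P_forward = np.zeros(3 * hidden_size, dtype=np.float32)
P_backward = np.zeros(3 * hidden_size, dtype=np.float32)

W = np.stack([W_forward, W_backward])          # [2, 4*hidden_size, input_size]
R = np.stack([R_forward, R_backward])          # [2, 4*hidden_size, hidden_size]
B = np.stack([
    np.concatenate([Wb_forward, Rb_forward]),  # [Wb[iofc], Rb[iofc]]
    np.concatenate([Wb_backward, Rb_backward]),
])                                             # [2, 8*hidden_size]
P = np.stack([P_forward, P_backward])          # [2, 3*hidden_size]

print(W.shape, R.shape, B.shape, P.shape)
```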

#### Outputs (0 - 3)

<dl>
<dt><tt>Y</tt> (optional, differentiable) : T</dt>
<dd>A tensor that concats all the intermediate output values of the hidden. It has shape `[seq_length, num_directions, batch_size, hidden_size]`. </dd>
<dt><tt>Y_h</tt> (optional, differentiable) : T</dt>
<dd>The last output value of the hidden. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>Y_c</tt> (optional, differentiable) : T</dt>
<dd>The last output value of the cell. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain seq_lens to integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>batchwise</summary>

```python
input = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]).astype(np.float32)

input_size = 2
hidden_size = 7
weight_scale = 0.3
number_of_gates = 4
layout = 1

node = onnx.helper.make_node(
    "LSTM",
    inputs=["X", "W", "R"],
    outputs=["Y", "Y_h"],
    hidden_size=hidden_size,
    layout=layout,
)

W = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)

lstm = LSTMHelper(X=input, W=W, R=R, layout=layout)
Y, Y_h = lstm.step()
expect(
    node,
    inputs=[input, W, R],
    outputs=[Y.astype(np.float32), Y_h.astype(np.float32)],
    name="test_lstm_batchwise",
)
```

</details>


<details>
<summary>defaults</summary>

```python
input = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]).astype(np.float32)

input_size = 2
hidden_size = 3
weight_scale = 0.1
number_of_gates = 4

node = onnx.helper.make_node(
    "LSTM", inputs=["X", "W", "R"], outputs=["", "Y_h"], hidden_size=hidden_size
)

W = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)

lstm = LSTMHelper(X=input, W=W, R=R)
_, Y_h = lstm.step()
expect(
    node,
    inputs=[input, W, R],
    outputs=[Y_h.astype(np.float32)],
    name="test_lstm_defaults",
)
```

</details>


<details>
<summary>initial_bias</summary>

```python
input = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]).astype(
    np.float32
)

input_size = 3
hidden_size = 4
weight_scale = 0.1
custom_bias = 0.1
number_of_gates = 4

node = onnx.helper.make_node(
    "LSTM",
    inputs=["X", "W", "R", "B"],
    outputs=["", "Y_h"],
    hidden_size=hidden_size,
)

W = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)

# Adding custom bias
W_B = custom_bias * np.ones((1, number_of_gates * hidden_size)).astype(
    np.float32
)
R_B = np.zeros((1, number_of_gates * hidden_size)).astype(np.float32)
B = np.concatenate((W_B, R_B), 1)

lstm = LSTMHelper(X=input, W=W, R=R, B=B)
_, Y_h = lstm.step()
expect(
    node,
    inputs=[input, W, R, B],
    outputs=[Y_h.astype(np.float32)],
    name="test_lstm_with_initial_bias",
)
```

</details>


<details>
<summary>peepholes</summary>

```python
input = np.array([[[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]]).astype(
    np.float32
)

input_size = 4
hidden_size = 3
weight_scale = 0.1
number_of_gates = 4
number_of_peepholes = 3

node = onnx.helper.make_node(
    "LSTM",
    inputs=["X", "W", "R", "B", "sequence_lens", "initial_h", "initial_c", "P"],
    outputs=["", "Y_h"],
    hidden_size=hidden_size,
)

# Initializing Inputs
W = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, input_size)
).astype(np.float32)
R = weight_scale * np.ones(
    (1, number_of_gates * hidden_size, hidden_size)
).astype(np.float32)
B = np.zeros((1, 2 * number_of_gates * hidden_size)).astype(np.float32)
seq_lens = np.repeat(input.shape[0], input.shape[1]).astype(np.int32)
init_h = np.zeros((1, input.shape[1], hidden_size)).astype(np.float32)
init_c = np.zeros((1, input.shape[1], hidden_size)).astype(np.float32)
P = weight_scale * np.ones((1, number_of_peepholes * hidden_size)).astype(
    np.float32
)

lstm = LSTMHelper(
    X=input, W=W, R=R, B=B, P=P, initial_c=init_c, initial_h=init_h
)
_, Y_h = lstm.step()
expect(
    node,
    inputs=[input, W, R, B, seq_lens, init_h, init_c, P],
    outputs=[Y_h.astype(np.float32)],
    name="test_lstm_with_peepholes",
)
```

</details>


### <a name="LayerNormalization"></a><a name="layernormalization">**LayerNormalization**</a>

  This is layer normalization defined in ONNX as a function.
        The overall computation can be split into two stages.
        The first stage is standardization, which makes the
        normalized elements have zero mean and unit variance.
        The computation required by standardization can be
        described by the following equations.
        ```
        Mean = ReduceMean<axes=normalized_axes>(X)
        D = Sub(X, Mean)
        DD = Mul(D, D)
        Var = ReduceMean<axes=normalized_axes>(DD)
        VarEps = Add(Var, epsilon)
        StdDev = Sqrt(VarEps)
        InvStdDev = Reciprocal(StdDev)
        Normalized = Mul(D, InvStdDev)
        ```
        where `normalized_axes` is `[axis, ..., rank of X - 1]`.
        The variables `Var` and `StdDev` stand for variance and
        standard deviation, respectively. The second output is
        `Mean` and the last one is `InvStdDev`.
        Depending on `stash_type` attribute, the actual computation
        must happen in different floating-point precision.
        For example, if `stash_type` is 1, this operator casts
        all input variables to 32-bit float, performs the computation, and
        finally casts `Normalized` back to the original type of `X`.
        The second stage then scales and shifts the outcome of the
        first stage using
        ```
        NormalizedScaled = Mul(Normalized, Scale)
        Y = Add(NormalizedScaled, B)
        ```
        The second stage doesn't depend on `stash_type`.
        All equations are in [this syntax](https://github.com/onnx/onnx/blob/main/docs/Syntax.md).
        The same variable (i.e., input, output, and attribute) uses
        the same name in the equations above and this operator's definition.
        Let `d[i]` indicate the i-th dimension of `X`.
        If `X`'s shape is `[d[0], ..., d[axis-1], d[axis], ..., d[rank-1]]`,
        the shape of `Mean` and `InvStdDev` is `[d[0], ..., d[axis-1], 1, ..., 1]`.
        `Y` and `X` have the same shape. This operator supports unidirectional broadcasting
        (tensors `Scale` and `B` should be unidirectional broadcastable to tensor `X`);
        for more details please check [the doc](Broadcasting.md).
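
The two stages described above can be sketched in NumPy as follows. This is an illustrative sketch, not the reference implementation used by the test cases below. The function name `layer_norm` is hypothetical, and stage one is hard-coded to 32-bit float, which corresponds to the default `stash_type` of 1.

```python
import numpy as np

def layer_norm(X, Scale, B=None, axis=-1, epsilon=1e-5):
    # Hypothetical helper following the two-stage description.
    axis = axis % X.ndim
    normalized_axes = tuple(range(axis, X.ndim))

    # Stage one: standardization in float32.
    x32 = X.astype(np.float32)
    mean = x32.mean(axis=normalized_axes, keepdims=True)
    d = x32 - mean
    var = (d * d).mean(axis=normalized_axes, keepdims=True)
    inv_std_dev = 1.0 / np.sqrt(var + epsilon)
    normalized = (d * inv_std_dev).astype(X.dtype)

    # Stage two: scale and shift in the original type of X.
    Y = normalized * Scale
    if B is not None:
        Y = Y + B
    return Y, mean, inv_std_dev

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 3, 4)).astype(np.float16)
Scale = np.ones((3, 4), dtype=np.float16)
B = np.zeros((3, 4), dtype=np.float16)

Y, mean, inv_std_dev = layer_norm(X, Scale, B, axis=1)
print(Y.shape, mean.shape, inv_std_dev.shape)  # (2, 3, 4) (2, 1, 1) (2, 1, 1)
```

With `axis=1` on a rank-3 input, the normalized axes are `[1, 2]`, so `Mean` and `InvStdDev` keep the leading dimension and collapse the rest to 1.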

#### Version

This version of the operator has been available since version 17 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>The first normalization dimension. If rank(X) is r, axis' allowed range is [-r, r). Negative value means counting dimensions from the back.</dd>
<dt><tt>epsilon</tt> : float (default is 1e-05)</dt>
<dd>The epsilon value to use to avoid division by zero.</dd>
<dt><tt>stash_type</tt> : int (default is 1)</dt>
<dd>Type of Mean and InvStdDev. This also specifies stage one's computation precision.</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>X</tt> : T</dt>
<dd>Tensor to be normalized.</dd>
<dt><tt>Scale</tt> : T</dt>
<dd>Scale tensor.</dd>
<dt><tt>B</tt> (optional) : T</dt>
<dd>Bias tensor.</dd>
</dl>

#### Outputs (1 - 3)

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Normalized tensor.</dd>
<dt><tt>Mean</tt> (optional) : U</dt>
<dd>Saved mean used during training to speed up gradient computation.</dd>
<dt><tt>InvStdDev</tt> (optional) : U</dt>
<dd>Saved inverse standard deviation used during training to speed up gradient computation.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input types and output Y type to float tensors.</dd>
<dt><tt>U</tt> : tensor(float), tensor(bfloat16)</dt>
<dd>Type of Mean and InvStdDev tensors.</dd>
</dl>


#### Examples

<details>
<summary>2d</summary>

```python
X = np.random.randn(3, 4).astype(np.float32)

def case(axis: int) -> None:
    normalized_shape = calculate_normalized_shape(X.shape, axis)
    W = np.random.randn(*normalized_shape).astype(np.float32)
    B = np.random.randn(*normalized_shape).astype(np.float32)
    Y, mean, inv_std_dev = _layer_normalization(X, W, B, axis=axis)

    node = onnx.helper.make_node(
        "LayerNormalization",
        inputs=["X", "W", "B"],
        outputs=["Y", "Mean", "InvStdDev"],
        axis=axis,
    )

    if axis < 0:
        name = f"test_layer_normalization_2d_axis_negative_{-axis}"
    else:
        name = f"test_layer_normalization_2d_axis{axis}"

    expect(node, inputs=[X, W, B], outputs=[Y, mean, inv_std_dev], name=name)

for i in range(len(X.shape)):
    case(i)
    case(i - len(X.shape))
```

</details>


<details>
<summary>3d_epsilon</summary>

```python
epsilon = 1e-1
X = np.random.randn(2, 3, 5).astype(np.float32)

def case(axis: int) -> None:
    normalized_shape = calculate_normalized_shape(X.shape, axis)
    W = np.random.randn(*normalized_shape).astype(np.float32)
    B = np.random.randn(*normalized_shape).astype(np.float32)
    Y, mean, inv_std_dev = _layer_normalization(X, W, B, axis, epsilon)
    node = onnx.helper.make_node(
        "LayerNormalization",
        inputs=["X", "W", "B"],
        outputs=["Y", "Mean", "InvStdDev"],
        axis=axis,
        epsilon=epsilon,
    )

    if axis < 0:
        name = f"test_layer_normalization_3d_axis_negative_{-axis}_epsilon"
    else:
        name = f"test_layer_normalization_3d_axis{axis}_epsilon"

    expect(node, inputs=[X, W, B], outputs=[Y, mean, inv_std_dev], name=name)

for i in range(len(X.shape)):
    case(i)
    case(i - len(X.shape))
```

</details>


<details>
<summary>default_axis</summary>

```python
X = np.random.randn(2, 3, 4, 5).astype(np.float32)

# Default axis in LayerNormalization is -1.
normalized_shape = calculate_normalized_shape(X.shape, -1)
W = np.random.randn(*normalized_shape).astype(np.float32)
B = np.random.randn(*normalized_shape).astype(np.float32)
# The axis defaults to -1 in the reference implementation.
Y, mean, inv_std_dev = _layer_normalization(X, W, B)

# Not specifying axis attribute means -1.
node = onnx.helper.make_node(
    "LayerNormalization",
    inputs=["X", "W", "B"],
    outputs=["Y", "Mean", "InvStdDev"],
)

expect(
    node,
    inputs=[X, W, B],
    outputs=[Y, mean, inv_std_dev],
    name="test_layer_normalization_default_axis",
)
```

</details>


<details>
<summary>layernormalization</summary>

```python
X = np.random.randn(2, 3, 4, 5).astype(np.float32)

def case(axis: int) -> None:
    normalized_shape = calculate_normalized_shape(X.shape, axis)
    W = np.random.randn(*normalized_shape).astype(np.float32)
    B = np.random.randn(*normalized_shape).astype(np.float32)
    Y, mean, inv_std_dev = _layer_normalization(X, W, B, axis)

    node = onnx.helper.make_node(
        "LayerNormalization",
        inputs=["X", "W", "B"],
        outputs=["Y", "Mean", "InvStdDev"],
        axis=axis,
    )

    if axis < 0:
        name = f"test_layer_normalization_4d_axis_negative_{-axis}"
    else:
        name = f"test_layer_normalization_4d_axis{axis}"

    expect(node, inputs=[X, W, B], outputs=[Y, mean, inv_std_dev], name=name)

for i in range(len(X.shape)):
    case(i)
    case(i - len(X.shape))
```

</details>


### <a name="LeakyRelu"></a><a name="leakyrelu">**LeakyRelu**</a>

  LeakyRelu takes input data (Tensor<T>) and an argument alpha, and produces one
  output data (Tensor<T>) where the function `f(x) = alpha * x for x < 0`,
  `f(x) = x for x >= 0`, is applied to the data tensor elementwise.
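
The piecewise definition translates directly into `np.where`. The sketch below uses a hypothetical helper name `leaky_relu` and compares the result against the clip-based formulation used in the test cases.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Hypothetical helper: f(x) = x for x >= 0, alpha * x otherwise.
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 3.0], dtype=np.float32)
y = leaky_relu(x, alpha=0.1)

# Equivalent formulation built from two clips.
y_clip = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * 0.1
print(np.allclose(y, y_clip))
```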

#### Version

This version of the operator has been available since version 16 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#LeakyRelu-1">1</a>, <a href="Changelog.md#LeakyRelu-6">6</a>

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 0.01)</dt>
<dd>Coefficient of leakage.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bfloat16), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>leakyrelu</summary>

```python
node = onnx.helper.make_node(
    "LeakyRelu", inputs=["x"], outputs=["y"], alpha=0.1
)

x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-0.1, 0., 1.]
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * 0.1
expect(node, inputs=[x], outputs=[y], name="test_leakyrelu_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * 0.1
expect(node, inputs=[x], outputs=[y], name="test_leakyrelu")
```

</details>


<details>
<summary>leakyrelu_default</summary>

```python
default_alpha = 0.01
node = onnx.helper.make_node(
    "LeakyRelu",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * default_alpha
expect(node, inputs=[x], outputs=[y], name="test_leakyrelu_default")
```

</details>


### <a name="Less"></a><a name="less">**Less**</a>

  Returns the tensor resulting from performing the `less` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
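
As a small illustration of multidirectional broadcasting, the NumPy sketch below compares a `(3, 1)` tensor with a `(4,)` tensor. Both operands are expanded, and the result has shape `(3, 4)`. The input values are arbitrary.

```python
import numpy as np

a = np.array([[1], [2], [3]], dtype=np.int32)  # shape (3, 1)
b = np.array([0, 2, 4, 6], dtype=np.int32)     # shape (4,)

# Each row compares one element of a against every element of b.
c = np.less(a, b)                               # shape (3, 4)
print(c.shape)
```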

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Less-1">1</a>, <a href="Changelog.md#Less-7">7</a>, <a href="Changelog.md#Less-9">9</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input types to all numeric tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>


#### Examples

<details>
<summary>less</summary>

```python
node = onnx.helper.make_node(
    "Less",
    inputs=["x", "y"],
    outputs=["less"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less")
```

</details>


<details>
<summary>less_equal</summary>

```python
node = onnx.helper.make_node(
    "LessOrEqual",
    inputs=["x", "y"],
    outputs=["less_equal"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal")
```

</details>


<details>
<summary>less_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Less",
    inputs=["x", "y"],
    outputs=["less"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.less(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_bcast")
```

</details>


<details>
<summary>less_equal_broadcast</summary>

```python
node = onnx.helper.make_node(
    "LessOrEqual",
    inputs=["x", "y"],
    outputs=["less_equal"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = np.less_equal(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_less_equal_bcast")
```

</details>


### <a name="LessOrEqual"></a><a name="lessorequal">**LessOrEqual**</a>

  Returns the tensor resulting from performing the `less_equal` logical operation
  elementwise on the input tensors `A` and `B` (with NumPy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
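
The sketch below uses NumPy as a stand-in for the operator (an assumption, in line with how the test cases in this document compute expected values). It shows that `less_equal` differs from `less` only where the two inputs are equal.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([2, 2, 2], dtype=np.int32)

le = np.less_equal(a, b)  # True where a <= b
lt = np.less(a, b)        # True where a < b
eq = np.equal(a, b)       # True where a == b

# For these integer inputs, less_equal is the union of less and equal.
print(np.array_equal(le, np.logical_or(lt, eq)))  # True
print(le.tolist())  # [True, True, False]
print(lt.tolist())  # [True, False, False]
```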

#### Version

This version of the operator has been available since version 16 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#LessOrEqual-12">12</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input types to all numeric tensors.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>


### <a name="Log"></a><a name="log">**Log**</a>

  Calculates the natural log of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Log-1">1</a>, <a href="Changelog.md#Log-6">6</a>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The natural log of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>log</summary>

```python
node = onnx.helper.make_node(
    "Log",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([1, 10]).astype(np.float32)
y = np.log(x)  # expected output [0., 2.30258512]
expect(node, inputs=[x], outputs=[y], name="test_log_example")

x = np.exp(np.random.randn(3, 4, 5).astype(np.float32))
y = np.log(x)
expect(node, inputs=[x], outputs=[y], name="test_log")
```

</details>


### <a name="LogSoftmax"></a><a name="logsoftmax">**LogSoftmax**</a>

  The operator computes the log of softmax values for the given input:

   LogSoftmax(input, axis) = Log(Softmax(input, axis=axis))

  The "axis" attribute indicates the dimension along which LogSoftmax
  will be performed. The output tensor has the same shape
  and contains the LogSoftmax values of the corresponding input.
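
The examples in this section call a `logsoftmax` helper that is not shown in this document. A minimal sketch of such a helper, assuming the usual max-subtraction trick for numerical stability, might look like this (the actual helper in the ONNX test suite may differ):

```python
import numpy as np

def logsoftmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the per-slice maximum so that exp() cannot overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    # log(softmax(x)) = shifted - log(sum(exp(shifted)))
    return shifted - np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))

x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]], dtype=np.float32)
y = logsoftmax(x)
print(y)  # both rows are approximately [-3.44, -2.44, -1.44, -0.44]
```

Computing `np.log(softmax(x))` directly would overflow on the second row, since `exp(10000)` is not representable in float32; shifting by the maximum avoids that without changing the result.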

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#LogSoftmax-1">1</a>, <a href="Changelog.md#LogSoftmax-11">11</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>
Describes the dimension LogSoftmax will be performed on.
Negative value means counting dimensions
from the back. Accepted range is [-r, r-1] where r = rank(input).
</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>The input tensor of rank >= axis.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The output values with the same shape as the input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>logsoftmax</summary>

```python
node = onnx.helper.make_node(
    "LogSoftmax",
    inputs=["x"],
    outputs=["y"],
)
x = np.array([[-1, 0, 1]]).astype(np.float32)
# expected output
# [[-2.4076061 -1.407606  -0.407606 ]]
y = logsoftmax(x)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_example_1")
```

</details>


<details>
<summary>logsoftmax_axis</summary>

```python
x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]]).astype(np.float32)
# expected output
# [[-3.4401896  -2.4401896  -1.4401896  -0.44018966]
# [-3.4401896  -2.4401896  -1.4401896  -0.44018966]]
y = logsoftmax(x)

node = onnx.helper.make_node(
    "LogSoftmax",
    inputs=["x"],
    outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_large_number")

x = np.abs(np.random.randn(3, 4, 5).astype(np.float32))
node = onnx.helper.make_node(
    "LogSoftmax",
    inputs=["x"],
    outputs=["y"],
    axis=0,
)
y = logsoftmax(x, axis=0)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_axis_0")

node = onnx.helper.make_node(
    "LogSoftmax",
    inputs=["x"],
    outputs=["y"],
    axis=1,
)
y = logsoftmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_axis_1")

node = onnx.helper.make_node(
    "LogSoftmax",
    inputs=["x"],
    outputs=["y"],
    axis=2,
)
y = logsoftmax(x, axis=2)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_axis_2")

node = onnx.helper.make_node(
    "LogSoftmax",
    inputs=["x"],
    outputs=["y"],
    axis=-1,
)
y = logsoftmax(x, axis=-1)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_negative_axis")

# default axis is -1
node = onnx.helper.make_node(
    "LogSoftmax",
    inputs=["x"],
    outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_logsoftmax_default_axis")
```

</details>


### <a name="Loop"></a><a name="loop">**Loop**</a>

  Generic Looping construct. This loop has multiple termination conditions:

  1) Trip count. Iteration count specified at runtime. Set by
     specifying the input M. Optional. Set to empty string to omit.
     Note that a static trip count (specified at graph construction time) can be
     specified by passing in a constant node for input M.
  2) Loop termination condition. This is an input to the op that determines
     whether to run the first iteration and also a loop-carried dependency for
     the body graph. The body graph must yield a value for the condition variable,
     whether this input is provided or not.

  This table summarizes the operating modes of this operator with equivalent
  C-style code:

  Operator inputs defined as (max_trip_count, condition_var).

  * input ("", ""):
          for (int i=0; ; ++i) {
            cond = ... // Note this value is ignored, but is required in the body
          }

  * input ("", cond) // Note this is analogous to a while loop
          bool cond = ...;
          for (int i=0; cond; ++i) {
            cond = ...;
          }

  * input ("", 1) // Note this is analogous to a do-while loop
          bool cond = true;
          for (int i=0; cond; ++i) {
            cond = ...;
          }

  * input (trip_count, "") // Note this is analogous to a for loop
          int trip_count = ...;
          for (int i=0; i < trip_count; ++i) {
            cond = ...; // ignored
          }

  * input (trip_count, cond)
          int trip_count = ...;
          bool cond = ...;
          for (int i=0; i < trip_count && cond; ++i) {
            cond = ...;
          }
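
The operating modes above can be emulated with one small Python helper. This is only a sketch: `run_loop` and its `body` callback are hypothetical names, `None` stands in for an omitted (empty-string) input, and the body is assumed to take `(i, cond, state)` and return `(cond, state)`.

```python
from typing import Callable, Optional, Tuple

def run_loop(
    max_trip_count: Optional[int],
    cond: Optional[bool],
    body: Callable[[int, bool, int], Tuple[bool, int]],
    state: int,
) -> Tuple[int, int]:
    """Return (final state, number of iterations executed)."""
    use_cond = cond is not None  # an omitted cond is never checked
    keep_going = True if cond is None else cond
    i = 0
    while (max_trip_count is None or i < max_trip_count) and (
        not use_cond or keep_going
    ):
        # The body always yields a condition, even if it is ignored.
        keep_going, state = body(i, keep_going, state)
        i += 1
    return state, i

# Hypothetical body: add 1 to the state and continue while state < 3.
def body(i: int, cond: bool, state: int) -> Tuple[bool, int]:
    state += 1
    return state < 3, state

print(run_loop(5, None, body, 0))     # for loop: (5, 5)
print(run_loop(None, True, body, 0))  # do-while style: (3, 3)
print(run_loop(5, False, body, 0))    # cond is false up front: (0, 0)
print(run_loop(2, True, body, 0))     # trip count reached first: (2, 2)
```

Calling `run_loop(None, None, body, 0)` would never terminate, which matches the `("", "")` mode in the table.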


  *Sample usage - cond as well as trip count*

      graph predict-net {
        %a = Constant[value = <Scalar Tensor [3]>]()
        %b = Constant[value = <Scalar Tensor [6]>]()
        %keepgoing = Constant[value = <Scalar Tensor [1]>]()
        %max_trip_count = Constant[value = <Scalar Tensor [10]>]()
        %keepgoing_out, %b_out, %user_defined_vals = Loop[body = <graph body-net>](%max_trip_count, %keepgoing, %b)
        return
      }

      graph body-net (
        %i[INT32, scalar]           // iteration number
        %keepgoing_in[BOOL, scalar] // incoming loop-termination-condition; not used
        %b_in[INT32, scalar]        // incoming value of loop-carried-dependency b
      ) {
        %my_local = Add(%a, %b_in)
        %b_out = Sub(%a, %b_in) // outgoing value of loop-carried-dependency b
        %keepgoing_out = Greater(%my_local, %b_out) // outgoing loop-termination-condition
        %user_defined_val = Add(%b_in, %b_in) // scan-output value to be accumulated
        return %keepgoing_out, %b_out, %user_defined_val
      }

  *Sample equivalent C code*

      {
        /* User-defined code (enclosing scope) */
        int a = 3, b = 6;
        bool keepgoing = true; // Analogous to input cond
        /* End user-defined code */

        /* Implicitly-defined code */
        const int max_trip_count = 10; // Analogous to input M
        int user_defined_vals[]; // Imagine this is resizable
        /* End implicitly-defined code */
        /* initialize loop-carried variables and scan-output variables */
        bool keepgoing_out = keepgoing;
        int b_out = b;

        for (int i=0; i < max_trip_count && keepgoing_out; ++i) {
          /* Implicitly-defined code: bind actual parameter values
             to formal parameter variables of loop-body */
          bool keepgoing_in = keepgoing_out;
          int b_in = b_out;

          /* User-defined code (loop body) */
          int my_local = a + b_in; // Reading value "a" from the enclosing scope is fine
          b_out = a - b_in;
          keepgoing_out = my_local > b_out;
          int user_defined_val = b_in + b_in; // b_in and b_out are different variables
          /* End user-defined code */

          /* Implicitly defined-code */
          user_defined_vals[i] = user_defined_val; // accumulate scan-output values
        }
        // int t = my_local; // Can't do this. my_local is not accessible here.

        // The values below are bound to the output variables of the loop and therefore accessible
        // b_out; user_defined_vals; keepgoing_out;
      }

  There are several things of note in this code snippet:

  1) Values from the enclosing scope (i.e. variable "a" here) are in scope and can
     be referenced in the inputs of the loop.
  2) Any values computed in the loop body that need to be used in a subsequent
     iteration or after the loop are modelled using a pair of variables in the loop-body,
     consisting of an input variable (e.g., b_in) and an output variable (e.g., b_out).
     These are referred to as loop-carried dependencies. The loop operation node
     supplies the input value of the input variable for the first iteration, and
     returns the output value of the output variable produced by the final
     iteration.
  3) Scan_output variables are used to implicitly concatenate values computed across
     all the iterations. In the above example, the values of user_defined_val computed
     over all iterations are concatenated and returned as the value of user_defined_vals
     after the loop.
  4) Values created in the body cannot be accessed in the enclosing scope,
     except using the mechanism described above.
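
Tracing the sample above in plain Python may help make the loop-carried dependency and scan-output mechanics concrete. This is an illustrative sketch only; the variable names follow the sample graph, and the Python list stands in for the concatenated scan output.

```python
a, b = 3, 6             # values from the enclosing scope
max_trip_count = 10     # input M
keepgoing_out = True    # input cond
b_out = b               # initial value of the loop-carried dependency
user_defined_vals = []  # scan output, accumulated implicitly by Loop

i = 0
while i < max_trip_count and keepgoing_out:
    b_in = b_out                      # bind the previous output to the body input
    my_local = a + b_in               # local to the body
    b_out = a - b_in                  # outgoing loop-carried value
    keepgoing_out = my_local > b_out  # outgoing termination condition
    user_defined_vals.append(b_in + b_in)
    i += 1

print(i, b_out, user_defined_vals)  # 2 6 [12, -6]
```

The loop stops after two iterations because the condition becomes false, well before the trip count of 10 is reached.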

  Note that the semantics of this op support "diagonal" or "wavefront" execution.
  (See Step 3 here for an example:
  https://devblogs.nvidia.com/optimizing-recurrent-neural-networks-cudnn-5/).
  Frontends should emit multi-layer RNNs as a series of Loop operators (with
  time being the inner looping dimension), with each successive layer consuming
  the scan_outputs from the previous layer, possibly going through several
  point-wise operators (e.g. dropout, residual connections, linear layer).

  Inputs and outputs of the subgraph (produced by the Loop node) are matched by order rather than by name. The implementation will figure out the names based on this order.

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Loop-1">1</a>, <a href="Changelog.md#Loop-11">11</a>, <a href="Changelog.md#Loop-13">13</a>, <a href="Changelog.md#Loop-16">16</a>, <a href="Changelog.md#Loop-19">19</a>

#### Attributes

<dl>
<dt><tt>body</tt> : graph (required)</dt>
<dd>The graph run each iteration. It has 2+N inputs: (iteration_num, condition, loop carried dependencies...). It has 1+N+K outputs: (condition, loop carried dependencies..., scan_outputs...). Each scan_output is created by concatenating the value of the specified output value at the end of each iteration of the loop. It is an error if the dimensions or data type of these scan_outputs change across loop iterations.</dd>
</dl>

#### Inputs (2 - &#8734;)

<dl>
<dt><tt>M</tt> (optional) : I</dt>
<dd>A maximum trip-count for the loop specified at runtime. Optional. Pass empty string to skip.</dd>
<dt><tt>cond</tt> (optional) : B</dt>
<dd>A boolean termination condition. Optional. Pass empty string to skip.</dd>
<dt><tt>v_initial</tt> (variadic, heterogeneous) : V</dt>
<dd>The initial values of any loop-carried dependencies (values that change across loop iterations)</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>v_final_and_scan_outputs</tt> (variadic, heterogeneous) : V</dt>
<dd>Final N loop carried dependency values then K scan_outputs. Scan outputs must be Tensors.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(bfloat16)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128)), seq(tensor(float8e4m3fn)), seq(tensor(float8e4m3fnuz)), seq(tensor(float8e5m2)), seq(tensor(float8e5m2fnuz)), seq(tensor(uint4)), seq(tensor(int4)), optional(seq(tensor(uint8))), optional(seq(tensor(uint16))), optional(seq(tensor(uint32))), optional(seq(tensor(uint64))), optional(seq(tensor(int8))), optional(seq(tensor(int16))), optional(seq(tensor(int32))), optional(seq(tensor(int64))), optional(seq(tensor(bfloat16))), optional(seq(tensor(float16))), optional(seq(tensor(float))), optional(seq(tensor(double))), optional(seq(tensor(string))), optional(seq(tensor(bool))), optional(seq(tensor(complex64))), optional(seq(tensor(complex128))), optional(tensor(uint8)), optional(tensor(uint16)), optional(tensor(uint32)), optional(tensor(uint64)), optional(tensor(int8)), optional(tensor(int16)), optional(tensor(int32)), optional(tensor(int64)), optional(tensor(bfloat16)), optional(tensor(float16)), optional(tensor(float)), optional(tensor(double)), optional(tensor(string)), optional(tensor(bool)), optional(tensor(complex64)), optional(tensor(complex128)), optional(tensor(float8e4m3fn)), optional(tensor(float8e4m3fnuz)), optional(tensor(float8e5m2)), optional(tensor(float8e5m2fnuz)), optional(tensor(uint4)), optional(tensor(int4))</dt>
<dd>All Tensor, Sequence(Tensor), Optional(Tensor), and Optional(Sequence(Tensor)) types up to IRv10.</dd>
<dt><tt>I</tt> : tensor(int64)</dt>
<dd>tensor of int64, which should be a scalar.</dd>
<dt><tt>B</tt> : tensor(bool)</dt>
<dd>tensor of bool, which should be a scalar.</dd>
</dl>


#### Examples

<details>
<summary>loop_11</summary>

```python
# Given a tensor x of values [x1, ..., xN], and initial tensor y
# sum up its elements using a scan
# returning the final state (y+x1+x2+...+xN) as well as the scan_output
# [y+x1, y+x1+x2, ..., y+x1+x2+...+xN]

y_in = onnx.helper.make_tensor_value_info("y_in", onnx.TensorProto.FLOAT, [1])
y_out = onnx.helper.make_tensor_value_info("y_out", onnx.TensorProto.FLOAT, [1])
scan_out = onnx.helper.make_tensor_value_info(
    "scan_out", onnx.TensorProto.FLOAT, [1]
)
cond_in = onnx.helper.make_tensor_value_info(
    "cond_in", onnx.TensorProto.BOOL, []
)
cond_out = onnx.helper.make_tensor_value_info(
    "cond_out", onnx.TensorProto.BOOL, []
)
iter_count = onnx.helper.make_tensor_value_info(
    "iter_count", onnx.TensorProto.INT64, []
)

x = np.array([1, 2, 3, 4, 5]).astype(np.float32)
y = np.array([-2]).astype(np.float32)

x_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["x"],
    value=onnx.helper.make_tensor(
        name="const_tensor_x",
        data_type=onnx.TensorProto.FLOAT,
        dims=x.shape,
        vals=x.flatten().astype(float),
    ),
)

one_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["one"],
    value=onnx.helper.make_tensor(
        name="const_tensor_one",
        data_type=onnx.TensorProto.INT64,
        dims=(),
        vals=[1],
    ),
)

i_add_node = onnx.helper.make_node(
    "Add", inputs=["iter_count", "one"], outputs=["end"]
)

start_unsqueeze_node = onnx.helper.make_node(
    "Unsqueeze", inputs=["iter_count"], outputs=["slice_start"], axes=[0]
)

end_unsqueeze_node = onnx.helper.make_node(
    "Unsqueeze", inputs=["end"], outputs=["slice_end"], axes=[0]
)

slice_node = onnx.helper.make_node(
    "Slice", inputs=["x", "slice_start", "slice_end"], outputs=["slice_out"]
)

y_add_node = onnx.helper.make_node(
    "Add", inputs=["y_in", "slice_out"], outputs=["y_out"]
)

identity_node = onnx.helper.make_node(
    "Identity", inputs=["cond_in"], outputs=["cond_out"]
)

scan_identity_node = onnx.helper.make_node(
    "Identity", inputs=["y_out"], outputs=["scan_out"]
)

loop_body = onnx.helper.make_graph(
    [
        identity_node,
        x_const_node,
        one_const_node,
        i_add_node,
        start_unsqueeze_node,
        end_unsqueeze_node,
        slice_node,
        y_add_node,
        scan_identity_node,
    ],
    "loop_body",
    [iter_count, cond_in, y_in],
    [cond_out, y_out, scan_out],
)

node = onnx.helper.make_node(
    "Loop",
    inputs=["trip_count", "cond", "y"],
    outputs=["res_y", "res_scan"],
    body=loop_body,
)

trip_count = np.array(5).astype(np.int64)
res_y = np.array([13]).astype(np.float32)
cond = np.array(1).astype(bool)
res_scan = np.array([-1, 1, 4, 8, 13]).astype(np.float32).reshape((5, 1))
expect(
    node,
    inputs=[trip_count, cond, y],
    outputs=[res_y, res_scan],
    name="test_loop11",
    opset_imports=[onnx.helper.make_opsetid("", 11)],
)
```

</details>
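
The expected outputs in `loop_11` can be cross-checked without building a graph. The NumPy sketch below assumes the body behaves as the comments in the example describe (a running sum starting from `y`); it is not part of the ONNX test suite.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=np.float32)
y = np.array([-2], dtype=np.float32)

# Scan output: running totals y+x1, y+x1+x2, ..., one row per iteration.
res_scan = (y + np.cumsum(x)).reshape((5, 1))
# Final loop-carried value: the last running total.
res_y = res_scan[-1]

print(res_y.tolist())             # [13.0]
print(res_scan.ravel().tolist())  # [-1.0, 1.0, 4.0, 8.0, 13.0]
```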


<details>
<summary>loop_13</summary>

```python
# Given a tensor x of values [x1, ..., xN],
# Return a sequence of tensors of
#   [[x1], [x1, x2], ..., [x1, ..., xN]]

seq_in = onnx.helper.make_tensor_sequence_value_info(
    "seq_in", onnx.TensorProto.FLOAT, None
)
seq_out = onnx.helper.make_tensor_sequence_value_info(
    "seq_out", onnx.TensorProto.FLOAT, None
)
cond_in = onnx.helper.make_tensor_value_info(
    "cond_in", onnx.TensorProto.BOOL, []
)
cond_out = onnx.helper.make_tensor_value_info(
    "cond_out", onnx.TensorProto.BOOL, []
)
iter_count = onnx.helper.make_tensor_value_info(
    "iter_count", onnx.TensorProto.INT64, []
)

x = np.array([1, 2, 3, 4, 5]).astype(np.float32)

x_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["x"],
    value=onnx.helper.make_tensor(
        name="const_tensor_x",
        data_type=onnx.TensorProto.FLOAT,
        dims=x.shape,
        vals=x.flatten().astype(float),
    ),
)

one_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["one"],
    value=onnx.helper.make_tensor(
        name="const_tensor_one",
        data_type=onnx.TensorProto.INT64,
        dims=(),
        vals=[1],
    ),
)

zero_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["slice_start"],
    value=onnx.helper.make_tensor(
        name="const_tensor_zero",
        data_type=onnx.TensorProto.INT64,
        dims=(1,),
        vals=[0],
    ),
)

axes_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["axes"],
    value=onnx.helper.make_tensor(
        name="const_tensor_axes",
        data_type=onnx.TensorProto.INT64,
        dims=(),
        vals=[0],
    ),
)

add_node = onnx.helper.make_node(
    "Add", inputs=["iter_count", "one"], outputs=["end"]
)

end_unsqueeze_node = onnx.helper.make_node(
    "Unsqueeze", inputs=["end", "axes"], outputs=["slice_end"]
)

slice_node = onnx.helper.make_node(
    "Slice", inputs=["x", "slice_start", "slice_end"], outputs=["slice_out"]
)

insert_node = onnx.helper.make_node(
    "SequenceInsert", inputs=["seq_in", "slice_out"], outputs=["seq_out"]
)

identity_node = onnx.helper.make_node(
    "Identity", inputs=["cond_in"], outputs=["cond_out"]
)

loop_body = onnx.helper.make_graph(
    [
        identity_node,
        x_const_node,
        one_const_node,
        zero_const_node,
        add_node,
        axes_node,
        end_unsqueeze_node,
        slice_node,
        insert_node,
    ],
    "loop_body",
    [iter_count, cond_in, seq_in],
    [cond_out, seq_out],
)

node = onnx.helper.make_node(
    "Loop",
    inputs=["trip_count", "cond", "seq_empty"],
    outputs=["seq_res"],
    body=loop_body,
)

trip_count = np.array(5).astype(np.int64)
seq_empty: List[Any] = []
seq_res = [x[: int(i)] for i in x]
cond = np.array(1).astype(bool)
expect(
    node,
    inputs=[trip_count, cond, seq_empty],
    outputs=[seq_res],
    name="test_loop13_seq",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
    input_type_protos=[
        onnx.helper.make_tensor_type_proto(
            onnx.TensorProto.INT64, trip_count.shape
        ),
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.BOOL, cond.shape),
        onnx.helper.make_sequence_type_proto(
            onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, [])
        ),
    ],
)
```

</details>


<details>
<summary>loop_16_none</summary>

```python
# Given a tensor sequence of values [x1, ..., xN], and an initial optional sequence of tensors [x0],
# Return a concatenated sequence of tensors of
#   [x0, [x1], [x1, x2], ..., [x1, ..., xN]]

ten_in_tp = onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, [])
seq_in_tp = onnx.helper.make_sequence_type_proto(ten_in_tp)
opt_in_tp = onnx.helper.make_optional_type_proto(seq_in_tp)
opt_in = onnx.helper.make_value_info("opt_seq_in", opt_in_tp)
seq_out = onnx.helper.make_tensor_sequence_value_info(
    "seq_out", onnx.TensorProto.FLOAT, []
)
cond_in = onnx.helper.make_tensor_value_info(
    "cond_in", onnx.TensorProto.BOOL, []
)
cond_out = onnx.helper.make_tensor_value_info(
    "cond_out", onnx.TensorProto.BOOL, []
)
iter_count = onnx.helper.make_tensor_value_info(
    "iter_count", onnx.TensorProto.INT64, []
)

x0 = np.array(0).astype(np.float32)
x = np.array([1, 2, 3, 4, 5]).astype(np.float32)

optional_has_elem_node = onnx.helper.make_node(
    "OptionalHasElement", inputs=["opt_seq_in"], outputs=["optional_has_elem"]
)

optional_is_none = onnx.helper.make_node(
    "Not", inputs=["optional_has_elem"], outputs=["optional_is_none"]
)

optional_get_elem = onnx.helper.make_node(
    "OptionalGetElement", inputs=["opt_seq_in"], outputs=["seq_in"]
)

constant_in = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["constant_in"],
    value=onnx.helper.make_tensor(
        name="const_tensor", data_type=onnx.TensorProto.FLOAT, dims=(), vals=[0]
    ),
)

seq_const_in = onnx.helper.make_node(
    "SequenceConstruct", inputs=["constant_in"], outputs=["init_seq_in"]
)

then_seq_out = onnx.helper.make_tensor_sequence_value_info(
    "init_seq_in", onnx.TensorProto.FLOAT, []
)
then_body = onnx.helper.make_graph(
    [constant_in, seq_const_in], "then_body", [], [then_seq_out]
)

else_seq_out = onnx.helper.make_tensor_sequence_value_info(
    "seq_in", onnx.TensorProto.FLOAT, []
)
else_body = onnx.helper.make_graph(
    [optional_get_elem], "else_body", [], [else_seq_out]
)

if_node = onnx.helper.make_node(
    "If",
    inputs=["optional_is_none"],
    outputs=["sequence"],
    then_branch=then_body,
    else_branch=else_body,
)

x_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["x"],
    value=onnx.helper.make_tensor(
        name="const_tensor_x",
        data_type=onnx.TensorProto.FLOAT,
        dims=x.shape,
        vals=x.flatten().astype(float),
    ),
)

one_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["one"],
    value=onnx.helper.make_tensor(
        name="const_tensor_one",
        data_type=onnx.TensorProto.INT64,
        dims=(),
        vals=[1],
    ),
)

zero_const_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["slice_start"],
    value=onnx.helper.make_tensor(
        name="const_tensor_zero",
        data_type=onnx.TensorProto.INT64,
        dims=(1,),
        vals=[0],
    ),
)

axes_node = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["axes"],
    value=onnx.helper.make_tensor(
        name="const_tensor_axes",
        data_type=onnx.TensorProto.INT64,
        dims=(),
        vals=[0],
    ),
)

add_node = onnx.helper.make_node(
    "Add", inputs=["iter_count", "one"], outputs=["end"]
)

end_unsqueeze_node = onnx.helper.make_node(
    "Unsqueeze", inputs=["end", "axes"], outputs=["slice_end"]
)

slice_node = onnx.helper.make_node(
    "Slice", inputs=["x", "slice_start", "slice_end"], outputs=["slice_out"]
)

insert_node = onnx.helper.make_node(
    "SequenceInsert", inputs=["sequence", "slice_out"], outputs=["seq_out"]
)

identity_node = onnx.helper.make_node(
    "Identity", inputs=["cond_in"], outputs=["cond_out"]
)

loop_body = onnx.helper.make_graph(
    [
        identity_node,
        optional_has_elem_node,
        optional_is_none,
        if_node,
        x_const_node,
        one_const_node,
        zero_const_node,
        add_node,
        axes_node,
        end_unsqueeze_node,
        slice_node,
        insert_node,
    ],
    "loop_body",
    [iter_count, cond_in, opt_in],
    [cond_out, seq_out],
)

node = onnx.helper.make_node(
    "Loop",
    inputs=["trip_count", "cond", "opt_seq"],
    outputs=["seq_res"],
    body=loop_body,
)

trip_count = np.array(5).astype(np.int64)
cond = np.array(1).astype(bool)
seq_res = compute_loop_outputs(x, [x0], trip_count)
opt_seq_in: List[Any] = [x0]
expect(
    node,
    inputs=[trip_count, cond, opt_seq_in],
    outputs=[seq_res],
    name="test_loop16_seq_none",
    opset_imports=[onnx.helper.make_opsetid("", 16)],
    input_type_protos=[
        onnx.helper.make_tensor_type_proto(
            onnx.TensorProto.INT64, trip_count.shape
        ),
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.BOOL, cond.shape),
        opt_in_tp,
    ],
)
```

</details>
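
`compute_loop_outputs` is called in `loop_16_none` but is not defined in this document. A hypothetical stand-in, written from the behavior of the example graph (start from the given sequence, or from a sequence holding a scalar 0 when the optional input is absent, then append the prefix `x[:i+1]` on each iteration), could look like this; the helper in the ONNX test suite may be implemented differently.

```python
from typing import List, Optional

import numpy as np

def compute_loop_outputs(
    x: np.ndarray, seq: Optional[List[np.ndarray]], trip_count: int
) -> List[np.ndarray]:
    # Assumption: an absent optional input falls back to [0.0], mirroring
    # the "then" branch of the If node in the example graph.
    result = [np.array(0, dtype=np.float32)] if seq is None else list(seq)
    for i in range(int(trip_count)):
        result.append(x[: i + 1])  # prefix [x1, ..., x_{i+1}]
    return result

x0 = np.array(0, dtype=np.float32)
x = np.array([1, 2, 3, 4, 5], dtype=np.float32)
seq_res = compute_loop_outputs(x, [x0], 5)
print([np.atleast_1d(t).tolist() for t in seq_res])
# [[0.0], [1.0], [1.0, 2.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0, 5.0]]
```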


### <a name="LpNormalization"></a><a name="lpnormalization">**LpNormalization**</a>

  Given a matrix, apply Lp-normalization along the provided axis.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>The axis on which to apply normalization; -1 means the last axis.</dd>
<dt><tt>p</tt> : int (default is 2)</dt>
<dd>The order of the normalization; only 1 and 2 are supported.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input matrix</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Matrix after normalization</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
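
This section lists no test cases, so here is a minimal NumPy sketch of the normalization, assuming the conventional definition `output = input / ||input||_p` taken along `axis`. The function name `lp_normalize` is hypothetical, and all-zero slices (division by zero) are left unhandled because the text above does not specify that case.

```python
import numpy as np

def lp_normalize(x: np.ndarray, axis: int = -1, p: int = 2) -> np.ndarray:
    if p == 1:
        # L1 norm: sum of absolute values along the axis.
        norm = np.sum(np.abs(x), axis=axis, keepdims=True)
    elif p == 2:
        # L2 norm: square root of the sum of squares along the axis.
        norm = np.sqrt(np.sum(np.square(x), axis=axis, keepdims=True))
    else:
        raise ValueError("only p=1 and p=2 are supported")
    return x / norm

x = np.array([[3.0, 4.0], [6.0, 8.0]], dtype=np.float32)
print(lp_normalize(x, axis=-1, p=2))  # each row now has unit L2 norm
print(lp_normalize(x, axis=0, p=1))   # each column's absolute values sum to 1
```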


### <a name="LpPool"></a><a name="lppool">**LpPool**</a>

  LpPool consumes an input tensor X and applies Lp pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Lp pooling consists of computing the Lp norm on all values of a subset
   of the input tensor according to the kernel size and downsampling the
   data into the output tensor Y for further processing. The output spatial shape is computed as follows:
   ```
   output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - {kernelSpatialShape}) / strides_spatial_shape[i] + 1)
   ```
   or
   ```
   output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - {kernelSpatialShape}) / strides_spatial_shape[i] + 1)
   ```
   if ceil_mode is enabled. `pad_shape[i]` is the sum of the pads along axis `i`.

   `auto_pad` is a DEPRECATED attribute. If you are currently using it, the output spatial shape is computed as follows:
   ```
   VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - {kernelSpatialShape} + 1) / strides_spatial_shape[i])
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
   ```
   The pad shape is computed as follows if `SAME_UPPER` or `SAME_LOWER` is used:
   ```
   pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + {kernelSpatialShape} - input_spatial_shape[i]
   ```
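
To make the formulas above concrete, the following sketch evaluates them for a single spatial axis. The function names are hypothetical, and `{kernelSpatialShape}` is taken to be the plain kernel size along that axis, which is an assumption that ignores `dilations`.

```python
import math
from typing import Tuple

def explicit_output_size(in_size: int, kernel: int, stride: int,
                         pad_total: int = 0, ceil_mode: bool = False) -> int:
    # pad_total is pad_shape[i], the sum of the begin and end pads on this axis.
    rounding = math.ceil if ceil_mode else math.floor
    return rounding((in_size + pad_total - kernel) / stride + 1)

def valid_output_size(in_size: int, kernel: int, stride: int) -> int:
    return math.ceil((in_size - kernel + 1) / stride)

def same_output_size_and_pad(in_size: int, kernel: int, stride: int) -> Tuple[int, int]:
    out_size = math.ceil(in_size / stride)
    pad_total = (out_size - 1) * stride + kernel - in_size
    return out_size, pad_total

print(explicit_output_size(32, kernel=2, stride=1))      # 31
print(valid_output_size(32, kernel=2, stride=1))         # 31
print(same_output_size_and_pad(32, kernel=2, stride=1))  # (32, 1)
```

The first result agrees with the `lppool_1d_default` example, where an input of length 32 with `kernel_shape=[2]` and `strides=[1]` yields an output of length 31.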

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#LpPool-1">1</a>, <a href="Changelog.md#LpPool-2">2</a>, <a href="Changelog.md#LpPool-11">11</a>

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string (default is NOTSET)</dt>
<dd>auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER or SAME_LOWER means padding the input so that `output_shape[i] = ceil(input_shape[i] / strides[i])` for each axis `i`. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). If the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER.</dd>
<dt><tt>ceil_mode</tt> : int (default is 0)</dt>
<dd>Whether to use ceil or floor (default) to compute the output shape.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>Dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.</dd>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>p</tt> : int (default is 2)</dt>
<dd>p value of the Lp norm used to pool over the input data.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each spatial axis; it can take any value greater than or equal to 0. The values represent the number of pixels added to the beginning and end part of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin...x1_end, x2_end,...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>
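
The `pads` layout lists all begin values first and all end values second, which differs from the per-axis pairs that `np.pad` expects. The following is a minimal sketch of the conversion for an (N x C x D1 x ... x Dn) tensor. The helper name `onnx_pads_to_numpy` is hypothetical and is not part of ONNX.

```python
import numpy as np


def onnx_pads_to_numpy(pads):
    """Convert [x1_begin, x2_begin, ..., x1_end, x2_end, ...] to np.pad pairs.

    Hypothetical helper for illustration only.
    """
    n = len(pads) // 2  # number of spatial axes
    # The batch (N) and channel (C) axes are never padded.
    return [(0, 0), (0, 0)] + [(pads[i], pads[i + n]) for i in range(n)]


x = np.zeros((1, 3, 28, 28), dtype=np.float32)
# H gets 2 at the beginning and 0 at the end; W gets 1 and 3.
pad_width = onnx_pads_to_numpy([2, 1, 0, 3])
padded = np.pad(x, pad_width, mode="constant", constant_values=0)
print(padded.shape)  # (1, 3, 30, 32)
```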

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor from Lp pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes.</dd>
</dl>
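
As a rough illustration of what `Y` contains, the sketch below computes Lp pooling naively for a 2D input without padding and with floor rounding. It assumes the Lp norm of a window is `(sum(|x|^p))^(1/p)`; the function name `lp_pool_2d` is hypothetical and this is not the reference implementation.

```python
import numpy as np


def lp_pool_2d(x, kernel_shape, strides=(1, 1), dilations=(1, 1), p=2):
    """Naive Lp pooling over an (N, C, H, W) array; no padding, floor mode."""
    n, c, h, w = x.shape
    kh, kw = kernel_shape
    sh, sw = strides
    dh, dw = dilations
    out_h = (h - dh * (kh - 1) - 1) // sh + 1
    out_w = (w - dw * (kw - 1) - 1) // sw + 1
    y = np.zeros((n, c, out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            # Dilated window: every dh-th row and every dw-th column.
            window = x[
                :,
                :,
                i * sh : i * sh + dh * (kh - 1) + 1 : dh,
                j * sw : j * sw + dw * (kw - 1) + 1 : dw,
            ]
            y[:, :, i, j] = (np.abs(window) ** p).sum(axis=(2, 3)) ** (1.0 / p)
    return y.astype(x.dtype)


x = np.arange(1, 17, dtype=np.float32).reshape(1, 1, 4, 4)
y = lp_pool_2d(x, kernel_shape=(2, 2), dilations=(2, 2), p=2)
print(y)
```

With these inputs the first window covers the values 1, 3, 9 and 11, so the first output element is sqrt(1 + 9 + 81 + 121) = sqrt(212).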

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>lppool_1d_default</summary>

```python
"""input_shape: [1, 3, 32]
output_shape: [1, 3, 31]
"""
p = 3
kernel_shape = [2]
strides = [1]
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=kernel_shape,
    strides=strides,
    p=p,
)
x = np.random.randn(1, 3, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)

expect(node, inputs=[x], outputs=[y], name="test_lppool_1d_default")
```

</details>


<details>
<summary>lppool_2d_default</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 31, 31]
"""
p = 4
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (2, 2)
strides = (1, 1)
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)

expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_default")
```

</details>


<details>
<summary>lppool_2d_dilations</summary>

```python
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
p = 2
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    strides=[1, 1],
    dilations=[2, 2],
    p=p,
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)

y = np.array(
    [
        [
            [
                [14.560219778561036, 16.24807680927192],
                [21.633307652783937, 23.49468024894146],
            ]
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_dilations")
```

</details>


<details>
<summary>lppool_2d_pads</summary>

```python
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
p = 3
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[2, 2, 2, 2],
    p=p,
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = pad_top = pad_right = pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, pads = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=0,
)
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", pads, p=p)

expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_pads")
```

</details>


<details>
<summary>lppool_2d_same_lower</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [1, 0, 1, 0] by axis
"""
p = 4
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_LOWER",
    p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
    "SAME_LOWER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
    "SAME_LOWER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_bottom = pad_shape[0] // 2
pad_top = pad_shape[0] - pad_bottom
pad_right = pad_shape[1] // 2
pad_left = pad_shape[1] - pad_right
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=0,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", pads, p=p)

expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_same_lower")
```

</details>


<details>
<summary>lppool_2d_same_upper</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [0, 1, 0, 1] by axis
"""
p = 2
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_UPPER",
    p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
    "SAME_UPPER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
    "SAME_UPPER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_top = pad_shape[0] // 2
pad_bottom = pad_shape[0] - pad_top
pad_left = pad_shape[1] // 2
pad_right = pad_shape[1] - pad_left
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=0,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", pads, p=p)

expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_same_upper")
```

</details>


<details>
<summary>lppool_2d_strides</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 10, 10]
"""
p = 2
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    strides=[3, 3],
    p=p,
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (5, 5)
strides = (3, 3)
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)

expect(node, inputs=[x], outputs=[y], name="test_lppool_2d_strides")
```

</details>


<details>
<summary>lppool_3d_default</summary>

```python
"""input_shape: [1, 3, 32, 32, 32]
output_shape: [1, 3, 31, 31, 31]
"""
p = 3
node = onnx.helper.make_node(
    "LpPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2, 2],
    p=p,
)
x = np.random.randn(1, 3, 32, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "LPPOOL", p=p)

expect(node, inputs=[x], outputs=[y], name="test_lppool_3d_default")
```

</details>


### <a name="MatMul"></a><a name="matmul">**MatMul**</a>

  Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#MatMul-1">1</a>, <a href="Changelog.md#MatMul-9">9</a>

#### Inputs

<dl>
<dt><tt>A</tt> (differentiable) : T</dt>
<dd>N-dimensional matrix A</dd>
<dt><tt>B</tt> (differentiable) : T</dt>
<dd>N-dimensional matrix B</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Matrix multiply results from A * B</dd>
</dl>
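
Because the operator follows `numpy.matmul`, the operands do not need to have the same rank. A short sketch of the resulting shapes, using `numpy` directly rather than an ONNX runtime:

```python
import numpy as np

a = np.ones((2, 3, 4), dtype=np.float32)

# A 2-D right operand is broadcast across the leading batch dimension.
b = np.ones((4, 5), dtype=np.float32)
c = np.matmul(a, b)
print(c.shape)  # (2, 3, 5)

# A 1-D right operand is treated as a column vector, and the added
# dimension is removed from the result.
v = np.ones(4, dtype=np.float32)
d = np.matmul(a, v)
print(d.shape)  # (2, 3)
```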

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float/int tensors.</dd>
</dl>


#### Examples

<details>
<summary>matmul</summary>

```python
node = onnx.helper.make_node(
    "MatMul",
    inputs=["a", "b"],
    outputs=["c"],
)

# 2d
a = np.random.randn(3, 4).astype(np.float32)
b = np.random.randn(4, 3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_2d")

# 3d
a = np.random.randn(2, 3, 4).astype(np.float32)
b = np.random.randn(2, 4, 3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_3d")

# 4d
a = np.random.randn(1, 2, 3, 4).astype(np.float32)
b = np.random.randn(1, 2, 4, 3).astype(np.float32)
c = np.matmul(a, b)
expect(node, inputs=[a, b], outputs=[c], name="test_matmul_4d")
```

</details>


### <a name="MatMulInteger"></a><a name="matmulinteger">**MatMulInteger**</a>

  Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html.
  The product MUST never overflow. The accumulation may overflow if and only if it is performed in 32 bits.

#### Version

This version of the operator has been available since version 10 of the default ONNX operator set.

#### Inputs (2 - 4)

<dl>
<dt><tt>A</tt> (non-differentiable) : T1</dt>
<dd>N-dimensional matrix A</dd>
<dt><tt>B</tt> (non-differentiable) : T2</dt>
<dd>N-dimensional matrix B</dd>
<dt><tt>a_zero_point</tt> (optional, non-differentiable) : T1</dt>
<dd>Zero point tensor for input 'A'. It is optional and the default value is 0. It could be a scalar or an N-D tensor. Scalar refers to per-tensor quantization whereas N-D refers to per-row quantization. If the input is 2D of shape [M, K] then the zero point tensor may be an M element vector [zp_1, zp_2, ..., zp_M]. If the input is an N-D tensor with shape [D1, D2, M, K] then the zero point tensor may have shape [D1, D2, M, 1].</dd>
<dt><tt>b_zero_point</tt> (optional, non-differentiable) : T2</dt>
<dd>Zero point tensor for input 'B'. It is optional and the default value is 0. It could be a scalar or an N-D tensor. Scalar refers to per-tensor quantization whereas N-D refers to per-column quantization. If the input is 2D of shape [K, N] then the zero point tensor may be an N element vector [zp_1, zp_2, ..., zp_N]. If the input is an N-D tensor with shape [D1, D2, K, N] then the zero point tensor may have shape [D1, D2, 1, N].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T3</dt>
<dd>Matrix multiply results from A * B</dd>
</dl>
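
One way to read the zero-point inputs is that each is subtracted from its operand before an integer matrix product is taken. The sketch below assumes exactly that and accumulates in 32 bits. The function name `matmul_integer` is hypothetical, and the sketch only handles zero points that are scalars or already broadcastable against their operand (a per-row vector for a 2D `A` would first need reshaping to `[M, 1]`).

```python
import numpy as np


def matmul_integer(a, b, a_zero_point=0, b_zero_point=0):
    """Illustrative NumPy sketch; not the ONNX reference implementation."""
    # Widen to int32 before subtracting so 8-bit values cannot wrap around.
    a32 = a.astype(np.int32) - np.asarray(a_zero_point, dtype=np.int32)
    b32 = b.astype(np.int32) - np.asarray(b_zero_point, dtype=np.int32)
    return np.matmul(a32, b32)


A = np.array([[11, 7, 3], [10, 6, 2], [9, 5, 1], [8, 4, 0]], dtype=np.uint8)
B = np.array([[1, 4], [2, 5], [3, 6]], dtype=np.uint8)
Y = matmul_integer(
    A,
    B,
    a_zero_point=np.array([12], dtype=np.uint8),
    b_zero_point=np.array([0], dtype=np.uint8),
)
print(Y)
```

For the first row, `[11, 7, 3] - 12 = [-1, -5, -9]`, so the first output element is `-1*1 - 5*2 - 9*3 = -38`.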

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int8), tensor(uint8)</dt>
<dd>Constrain input A data type to 8-bit integer tensor.</dd>
<dt><tt>T2</tt> : tensor(int8), tensor(uint8)</dt>
<dd>Constrain input B data type to 8-bit integer tensor.</dd>
<dt><tt>T3</tt> : tensor(int32)</dt>
<dd>Constrain output Y data type as 32-bit integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>matmulinteger</summary>

```python
node = onnx.helper.make_node(
    "MatMulInteger",
    inputs=["A", "B", "a_zero_point", "b_zero_point"],
    outputs=["Y"],
)

A = np.array(
    [
        [11, 7, 3],
        [10, 6, 2],
        [9, 5, 1],
        [8, 4, 0],
    ],
    dtype=np.uint8,
)

a_zero_point = np.array([12], dtype=np.uint8)

B = np.array(
    [
        [1, 4],
        [2, 5],
        [3, 6],
    ],
    dtype=np.uint8,
)

b_zero_point = np.array([0], dtype=np.uint8)

output = np.array(
    [
        [-38, -83],
        [-44, -98],
        [-50, -113],
        [-56, -128],
    ],
    dtype=np.int32,
)

expect(
    node,
    inputs=[A, B, a_zero_point, b_zero_point],
    outputs=[output],
    name="test_matmulinteger",
)
```

</details>


### <a name="Max"></a><a name="max">**Max**</a>

  Element-wise max of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Max-1">1</a>, <a href="Changelog.md#Max-6">6</a>, <a href="Changelog.md#Max-8">8</a>, <a href="Changelog.md#Max-12">12</a>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic, differentiable) : T</dt>
<dd>List of tensors for max.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>max</tt> (differentiable) : T</dt>
<dd>Output tensor.</dd>
</dl>
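
Since the input is variadic and broadcasts NumPy-style, the result can be sketched as a left fold of `np.maximum` over the inputs. The helper name `variadic_max` is hypothetical.

```python
from functools import reduce

import numpy as np


def variadic_max(*tensors):
    """Element-wise max over one or more broadcastable arrays."""
    # With a single input, reduce returns that input unchanged.
    return reduce(np.maximum, tensors)


data_0 = np.array([3, 2, 1], dtype=np.float32)
data_1 = np.array([1, 4, 4], dtype=np.float32)
data_2 = np.array([2, 5, 3], dtype=np.float32)
print(variadic_max(data_0, data_1, data_2))  # [3. 5. 4.]

# Shapes (2, 1) and (3,) broadcast to (2, 3).
col = np.array([[1], [5]], dtype=np.float32)
row = np.array([2, 3, 4], dtype=np.float32)
broadcast_result = variadic_max(col, row)
print(broadcast_result.shape)  # (2, 3)
```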

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>max</summary>

```python
data_0 = np.array([3, 2, 1]).astype(np.float32)
data_1 = np.array([1, 4, 4]).astype(np.float32)
data_2 = np.array([2, 5, 3]).astype(np.float32)
result = np.array([3, 5, 4]).astype(np.float32)
node = onnx.helper.make_node(
    "Max",
    inputs=["data_0", "data_1", "data_2"],
    outputs=["result"],
)
expect(
    node,
    inputs=[data_0, data_1, data_2],
    outputs=[result],
    name="test_max_example",
)

node = onnx.helper.make_node(
    "Max",
    inputs=["data_0"],
    outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_max_one_input")

result = np.maximum(data_0, data_1)
node = onnx.helper.make_node(
    "Max",
    inputs=["data_0", "data_1"],
    outputs=["result"],
)
expect(
    node, inputs=[data_0, data_1], outputs=[result], name="test_max_two_inputs"
)
```

</details>


<details>
<summary>max_all_numeric_types</summary>

```python
for op_dtype in all_numeric_dtypes:
    data_0 = np.array([3, 2, 1]).astype(op_dtype)
    data_1 = np.array([1, 4, 4]).astype(op_dtype)
    result = np.array([3, 4, 4]).astype(op_dtype)
    node = onnx.helper.make_node(
        "Max",
        inputs=["data_0", "data_1"],
        outputs=["result"],
    )
    expect(
        node,
        inputs=[data_0, data_1],
        outputs=[result],
        name=f"test_max_{np.dtype(op_dtype).name}",
    )
```

</details>


### <a name="MaxPool"></a><a name="maxpool">**MaxPool**</a>

  MaxPool consumes an input tensor X and applies max pooling across
   the tensor according to kernel sizes, stride sizes, and pad lengths.
   Max pooling consists of computing the max on all values of a
   subset of the input tensor according to the kernel size and downsampling the
   data into the output tensor Y for further processing. The output spatial shape is calculated differently
   depending on whether explicit padding (the `pads` attribute) or auto padding (the `auto_pad` attribute) is used.
   With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):
   ```
   output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
   ```
   or
   ```
   output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
   ```
   if ceil_mode is enabled. `pad_shape[i]` is the sum of pads along axis `i`. Sliding windows that would start in the right padded region are ignored.

   `auto_pad` is a DEPRECATED attribute. If you are using it currently, the output spatial shape will be as follows when ceil_mode is enabled:
   ```
   VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
   ```
   or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):
   ```
   VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
   SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1
   ```
   And the pad shape will be as follows if `SAME_UPPER` or `SAME_LOWER` is used:
   ```
   pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
   ```
   The output of each pooling window is the maximum of its elements, excluding padding.
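
The explicit-padding formula, together with the rule that windows starting in the end-side padding are ignored, can be sketched for a single axis as follows. The function name `pool_output_size` and its parameter names are hypothetical; this is an illustration of the formulas above, not the reference implementation.

```python
import math


def pool_output_size(
    in_size, kernel, stride=1, pad_begin=0, pad_end=0, dilation=1, ceil_mode=False
):
    """Output size along one spatial axis with explicit padding."""
    span = in_size + pad_begin + pad_end - dilation * (kernel - 1) - 1
    rounding = math.ceil if ceil_mode else math.floor
    out = rounding(span / stride + 1)
    # In ceil mode the last window may start past the input and its
    # begin-side padding; such a window is dropped.
    if ceil_mode and (out - 1) * stride >= in_size + pad_begin:
        out -= 1
    return out


print(pool_output_size(32, kernel=5, stride=3))  # 10
print(pool_output_size(28, kernel=3, pad_begin=2, pad_end=2))  # 30
print(pool_output_size(4, kernel=3, stride=2, ceil_mode=True))  # 2
print(pool_output_size(2, kernel=1, stride=2, ceil_mode=True))  # 1
```

The last call shows the dropped window: the formula alone gives `ceil(1.5) = 2`, but the second window would start at offset 2, which is outside the input of size 2.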


#### Version

This version of the operator has been available since version 12 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#MaxPool-1">1</a>, <a href="Changelog.md#MaxPool-8">8</a>, <a href="Changelog.md#MaxPool-10">10</a>, <a href="Changelog.md#MaxPool-11">11</a>

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string (default is NOTSET)</dt>
<dd>auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER or SAME_LOWER means the input is padded so that `output_shape[i] = ceil(input_shape[i] / strides[i])` for each axis `i`. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). If the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER.</dd>
<dt><tt>ceil_mode</tt> : int (default is 0)</dt>
<dd>Whether to use ceil or floor (default) to compute the output shape.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>Dilation value along each spatial axis of filter. If not present, the dilation defaults to 1 along each spatial axis.</dd>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each spatial axis; it can take any value greater than or equal to 0. The values represent the number of pixels added to the beginning and end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>storage_order</tt> : int (default is 0)</dt>
<dd>The storage order of the tensor. 0 is row major, and 1 is column major. This attribute is used only to convert an n-tuple index value into a single integer value for producing the second output. </dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>
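
For `SAME_UPPER` and `SAME_LOWER`, the total padding along an axis follows from the output size, and the odd remainder goes to the end or the beginning respectively. A minimal single-axis sketch follows. The function name `same_padding` is hypothetical, and clamping the total at 0 is an assumption made here for inputs where no padding is needed.

```python
import math


def same_padding(in_size, kernel, stride=1, dilation=1, auto_pad="SAME_UPPER"):
    """Return (output_size, (pad_begin, pad_end)) for one spatial axis."""
    out_size = math.ceil(in_size / stride)
    effective_kernel = (kernel - 1) * dilation + 1
    # Assumption: a negative total means no padding is required.
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
    small = total // 2
    large = total - small
    if auto_pad == "SAME_UPPER":
        return out_size, (small, large)  # extra pixel at the end
    return out_size, (large, small)  # SAME_LOWER: extra pixel at the beginning


print(same_padding(32, kernel=2, auto_pad="SAME_UPPER"))  # (32, (0, 1))
print(same_padding(32, kernel=2, auto_pad="SAME_LOWER"))  # (32, (1, 0))
print(same_padding(5, kernel=3, stride=2))  # (3, (1, 1))
```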

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
</dl>

#### Outputs (1 - 2)

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output data tensor from average or max pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes. Floor value of the dimension is used.</dd>
<dt><tt>Indices</tt> (optional, non-differentiable) : I</dt>
<dd>Indices tensor from max pooling across the input tensor. The dimensions of indices are the same as those of the output tensor. The values in indices are the indices of the selected values during pooling. The indices are computed as for a flattened 1-D tensor, and they do not consider padding. So the values in indices are in [0, N x C x D1 x ... x Dn).</dd>
</dl>
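
The effect of `storage_order` on `Indices` can be illustrated for a single image (N = 1, C = 1), where flattening only involves the spatial axes. The sketch below assumes that row major corresponds to NumPy's `"C"` order and column major to `"F"` order. The helper name `flat_index` is hypothetical, and how the batch and channel axes enter the index for larger N or C is not covered here.

```python
import numpy as np


def flat_index(position, spatial_shape, storage_order=0):
    """Flatten an unpadded (row, col) position into a single index."""
    order = "C" if storage_order == 0 else "F"
    return int(np.ravel_multi_index(position, spatial_shape, order=order))


# Position (1, 3) in a 5 x 5 input holds the value 9 when the input
# is filled with 1..25 in row-major order.
print(flat_index((1, 3), (5, 5), storage_order=0))  # 8
print(flat_index((1, 3), (5, 5), storage_order=1))  # 16
```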

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(int8), tensor(uint8)</dt>
<dd>Constrain input and output types to float and 8 bit tensors.</dd>
<dt><tt>I</tt> : tensor(int64)</dt>
<dd>Constrain index tensor to int64</dd>
</dl>


#### Examples

<details>
<summary>maxpool_1d_default</summary>

```python
"""input_shape: [1, 3, 32]
output_shape: [1, 3, 31]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2],
)
x = np.random.randn(1, 3, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2]
strides = [1]
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_1d_default")
```

</details>


<details>
<summary>maxpool_2d_ceil</summary>

```python
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    strides=[2, 2],
    ceil_mode=True,
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[11, 12], [15, 16]]]]).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_ceil")
```

</details>


<details>
<summary>maxpool_2d_ceil_output_size_reduce_by_one</summary>

```python
"""input_shape: [1, 1, 2, 2]
output_shape: [1, 1, 1, 1]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[1, 1],
    strides=[2, 2],
    ceil_mode=True,
)
x = np.array([[[[1, 2], [3, 4]]]]).astype(np.float32)
y = np.array([[[[1]]]]).astype(np.float32)

expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_maxpool_2d_ceil_output_size_reduce_by_one",
)
```

</details>


<details>
<summary>maxpool_2d_default</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 31, 31]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (2, 2)
strides = (1, 1)
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_default")
```

</details>


<details>
<summary>maxpool_2d_dilations</summary>

```python
"""input_shape: [1, 1, 4, 4]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    strides=[1, 1],
    dilations=[2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[11, 12], [15, 16]]]]).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_dilations")
```

</details>


<details>
<summary>maxpool_2d_pads</summary>

```python
"""input_shape: [1, 3, 28, 28]
output_shape: [1, 3, 30, 30]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    pads=[2, 2, 2, 2],
)
x = np.random.randn(1, 3, 28, 28).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (3, 3)
strides = (1, 1)
pad_bottom = pad_top = pad_right = pad_left = 2
pads = [pad_top, pad_left, pad_bottom, pad_right]
out_shape, pads = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)

y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX", pads)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_pads")
```

</details>


<details>
<summary>maxpool_2d_precomputed_pads</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array(
    [
        [
            [
                [13, 14, 15, 15, 15],
                [18, 19, 20, 20, 20],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
            ]
        ]
    ]
).astype(np.float32)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_pads")
```

</details>


<details>
<summary>maxpool_2d_precomputed_same_upper</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 3, 3]
pad_shape: [2, 2] -> [1, 1, 1, 1] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[3, 3],
    strides=[2, 2],
    auto_pad="SAME_UPPER",
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[7, 9, 10], [17, 19, 20], [22, 24, 25]]]]).astype(np.float32)

expect(
    node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_same_upper"
)
```

</details>


<details>
<summary>maxpool_2d_precomputed_strides</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool", inputs=["x"], outputs=["y"], kernel_shape=[2, 2], strides=[2, 2]
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[7, 9], [17, 19]]]]).astype(np.float32)

expect(
    node, inputs=[x], outputs=[y], name="test_maxpool_2d_precomputed_strides"
)
```

</details>


<details>
<summary>maxpool_2d_same_lower</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [1, 0, 1, 0] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_LOWER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
    "SAME_LOWER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
    "SAME_LOWER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_bottom = pad_shape[0] // 2
pad_top = pad_shape[0] - pad_bottom
pad_right = pad_shape[1] // 2
pad_left = pad_shape[1] - pad_right
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX", pads)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_same_lower")
```

</details>


<details>
<summary>maxpool_2d_same_upper</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 32, 32]
pad_shape: [1, 1] -> [0, 1, 0, 1] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2],
    auto_pad="SAME_UPPER",
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
kernel_shape = (2, 2)
strides = (1, 1)
out_shape = get_output_shape_auto_pad(
    "SAME_UPPER", x_shape[2:], kernel_shape, strides
)
pad_shape = get_pad_shape(
    "SAME_UPPER", x_shape[2:], kernel_shape, strides, out_shape
)
pad_top = pad_shape[0] // 2
pad_bottom = pad_shape[0] - pad_top
pad_left = pad_shape[1] // 2
pad_right = pad_shape[1] - pad_left
padded = np.pad(
    x,
    ((0, 0), (0, 0), (pad_top, pad_bottom), (pad_left, pad_right)),
    mode="constant",
    constant_values=np.nan,
)
pads = [pad_top, pad_left, pad_bottom, pad_right]
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX", pads)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_same_upper")
```

</details>


<details>
<summary>maxpool_2d_strides</summary>

```python
"""input_shape: [1, 3, 32, 32]
output_shape: [1, 3, 10, 10]
"""
node = onnx.helper.make_node(
    "MaxPool", inputs=["x"], outputs=["y"], kernel_shape=[5, 5], strides=[3, 3]
)
x = np.random.randn(1, 3, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = (5, 5)
strides = (3, 3)
out_shape, pads = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_strides")
```

</details>


<details>
<summary>maxpool_2d_uint8</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.uint8)
y = np.array(
    [
        [
            [
                [13, 14, 15, 15, 15],
                [18, 19, 20, 20, 20],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
            ]
        ]
    ]
).astype(np.uint8)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_2d_uint8")
```

</details>


<details>
<summary>maxpool_3d_default</summary>

```python
"""input_shape: [1, 3, 32, 32, 32]
output_shape: [1, 3, 31, 31, 31]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2, 2],
)
x = np.random.randn(1, 3, 32, 32, 32).astype(np.float32)
x_shape = np.shape(x)
pads = None
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
out_shape, _ = get_output_shape_explicit_padding(
    pads, x_shape[2:], kernel_shape, strides
)
padded = x
y = pool(padded, x_shape, kernel_shape, strides, out_shape, "MAX")

expect(node, inputs=[x], outputs=[y], name="test_maxpool_3d_default")
```

</details>


<details>
<summary>maxpool_3d_dilations</summary>

```python
"""input_shape: [1, 1, 4, 4, 4]
output_shape: [1, 1, 2, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2, 2],
    strides=[1, 1, 1],
    dilations=[2, 2, 2],
)
x = np.array(
    [
        [
            [
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[[11, 12], [15, 16]], [[11, 12], [15, 16]]]]]).astype(
    np.float32
)

expect(node, inputs=[x], outputs=[y], name="test_maxpool_3d_dilations")
```

</details>


<details>
<summary>maxpool_3d_dilations_use_ref_impl</summary>

```python
"""input_shape: [1, 1, 4, 4, 4]
output_shape: [1, 1, 2, 2, 2]
"""
dilations = [2, 2, 2]
kernel_shape = [2, 2, 2]
strides = [1, 1, 1]
ceil_mode = False
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=[2, 2, 2],
    strides=[1, 1, 1],
    dilations=dilations,
)
x = np.array(
    [
        [
            [
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
                [
                    [1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16],
                ],
            ]
        ]
    ]
).astype(np.float32)

x_shape = x.shape[2:]
out_shape, pads = get_output_shape_explicit_padding(
    None, x_shape, kernel_shape, strides, dilations, ceil_mode=ceil_mode
)
padded = x
y = pool(
    padded,
    (1, 1, *x_shape),
    kernel_shape,
    strides,
    out_shape,
    "MAX",
    pads,
    dilations=dilations,
)

expect(
    node, inputs=[x], outputs=[y], name="test_maxpool_3d_dilations_use_ref_impl"
)
```

</details>


<details>
<summary>maxpool_3d_dilations_use_ref_impl_large</summary>

```python
x_shape = (32, 32, 32)
dilations = (2, 2, 2)
kernel_shape = (5, 5, 5)
strides = (3, 3, 3)
ceil_mode = True

node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y"],
    kernel_shape=kernel_shape,
    strides=strides,
    dilations=dilations,
    ceil_mode=ceil_mode,
)

x = np.random.randn(1, 1, *x_shape).astype(np.float32)
out_shape, pads = get_output_shape_explicit_padding(
    None, x_shape, kernel_shape, strides, dilations, ceil_mode=ceil_mode
)
padded = np.pad(
    x,
    (
        (0, 0),
        (0, 0),
        (pads[0], pads[3]),
        (pads[1], pads[4]),
        (pads[2], pads[5]),
    ),
    mode="constant",
    constant_values=0,
)
y = pool(
    padded,
    (1, 1, *x_shape),
    kernel_shape,
    strides,
    out_shape,
    "MAX",
    pads,
    dilations=dilations,
)

expect(
    node,
    inputs=[x],
    outputs=[y],
    name="test_maxpool_3d_dilations_use_ref_impl_large",
)
```

</details>


<details>
<summary>maxpool_with_argmax_2d_precomputed_pads</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 5, 5]
pad_shape: [4, 4] -> [2, 2, 2, 2] by axis
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y", "z"],
    kernel_shape=[5, 5],
    pads=[2, 2, 2, 2],
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array(
    [
        [
            [
                [13, 14, 15, 15, 15],
                [18, 19, 20, 20, 20],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
                [23, 24, 25, 25, 25],
            ]
        ]
    ]
).astype(np.float32)
z = np.array(
    [
        [
            [
                [12, 13, 14, 14, 14],
                [17, 18, 19, 19, 19],
                [22, 23, 24, 24, 24],
                [22, 23, 24, 24, 24],
                [22, 23, 24, 24, 24],
            ]
        ]
    ]
).astype(np.int64)

expect(
    node,
    inputs=[x],
    outputs=[y, z],
    name="test_maxpool_with_argmax_2d_precomputed_pads",
)
```

</details>


<details>
<summary>maxpool_with_argmax_2d_precomputed_strides</summary>

```python
"""input_shape: [1, 1, 5, 5]
output_shape: [1, 1, 2, 2]
"""
node = onnx.helper.make_node(
    "MaxPool",
    inputs=["x"],
    outputs=["y", "z"],
    kernel_shape=[2, 2],
    strides=[2, 2],
    storage_order=1,
)
x = np.array(
    [
        [
            [
                [1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20],
                [21, 22, 23, 24, 25],
            ]
        ]
    ]
).astype(np.float32)
y = np.array([[[[7, 9], [17, 19]]]]).astype(np.float32)
z = np.array([[[[6, 16], [8, 18]]]]).astype(np.int64)

expect(
    node,
    inputs=[x],
    outputs=[y, z],
    name="test_maxpool_with_argmax_2d_precomputed_strides",
)
```

</details>
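
The `z` output in the `storage_order=1` example above holds column-major linear indices, whereas the default `storage_order=0` reports row-major ones. The sketch below is illustrative and not part of the generated test suite. It reproduces both layouts with `np.ravel_multi_index` for the four window maxima of that example, assuming a single batch and a single channel so that the leading dimensions contribute nothing to the index.

```python
import numpy as np

# Positions (row, col) of the window maxima 7, 9, 17, 19 in the 5x5 input above.
rows = np.array([1, 1, 3, 3])
cols = np.array([1, 3, 1, 3])

# Row-major (C order) linear index: row * width + col.
row_major = np.ravel_multi_index((rows, cols), (5, 5), order="C").reshape(2, 2)
# Column-major (Fortran order) linear index: col * height + row.
col_major = np.ravel_multi_index((rows, cols), (5, 5), order="F").reshape(2, 2)

print(row_major.tolist())  # [[6, 8], [16, 18]]
print(col_major.tolist())  # [[6, 16], [8, 18]], the z values shown above
```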


### <a name="MaxRoiPool"></a><a name="maxroipool">**MaxRoiPool**</a>

  ROI max pool consumes an input tensor X and regions of interest (RoIs),
   applies max pooling across each RoI, and produces a 4-D output tensor of shape
   (num_rois, channels, pooled_shape[0], pooled_shape[1]).

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>pooled_shape</tt> : list of ints (required)</dt>
<dd>ROI pool output shape (height, width).</dd>
<dt><tt>spatial_scale</tt> : float (default is 1.0)</dt>
<dd>Multiplicative spatial scale factor to translate ROI coordinates from their input scale to the scale used when pooling.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input data tensor from the previous operator; dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data.</dd>
<dt><tt>rois</tt> (non-differentiable) : T</dt>
<dd>RoIs (Regions of Interest) to pool over. Should be a 2-D tensor of shape (num_rois, 5) given as [[batch_id, x1, y1, x2, y2], ...].</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>RoI pooled output 4-D tensor of shape (num_rois, channels, pooled_shape[0], pooled_shape[1]).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>
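
No generated example accompanies this operator, so the following is a minimal NumPy sketch of what the description above could look like in practice. The helper name `max_roi_pool_sketch` is hypothetical, and the rounding of scaled RoI corners and the bin boundaries follow a common Caffe-style convention that is an assumption here; the text above does not pin these details down.

```python
import math

import numpy as np


def max_roi_pool_sketch(x, rois, pooled_shape, spatial_scale=1.0):
    """Hypothetical helper, not the ONNX reference implementation."""
    _, channels, height, width = x.shape
    pooled_h, pooled_w = pooled_shape
    out = np.zeros((rois.shape[0], channels, pooled_h, pooled_w), dtype=x.dtype)
    for r, roi in enumerate(rois):
        batch_id = int(roi[0])
        # Assumed convention: scale the corners, then round to integer cells.
        x1, y1, x2, y2 = (int(round(float(v) * spatial_scale)) for v in roi[1:5])
        roi_h = max(y2 - y1 + 1, 1)
        roi_w = max(x2 - x1 + 1, 1)
        for i in range(pooled_h):
            # Bin edges along the height, clipped to the feature map.
            h0 = min(max(y1 + math.floor(i * roi_h / pooled_h), 0), height)
            h1 = min(max(y1 + math.ceil((i + 1) * roi_h / pooled_h), 0), height)
            for j in range(pooled_w):
                w0 = min(max(x1 + math.floor(j * roi_w / pooled_w), 0), width)
                w1 = min(max(x1 + math.ceil((j + 1) * roi_w / pooled_w), 0), width)
                if h1 > h0 and w1 > w0:  # empty bins stay at zero
                    window = x[batch_id, :, h0:h1, w0:w1]
                    out[r, :, i, j] = window.max(axis=(1, 2))
    return out


X = np.arange(16, dtype=np.float32).reshape(1, 1, 4, 4)
# Each row is [batch_id, x1, y1, x2, y2].
rois = np.array([[0, 0, 0, 3, 3], [0, 1, 1, 2, 2]], dtype=np.float32)
Y = max_roi_pool_sketch(X, rois, pooled_shape=(2, 2))
print(Y.shape)  # (2, 1, 2, 2)
```

The output has one pooled map per RoI, which matches the (num_rois, channels, pooled_shape[0], pooled_shape[1]) shape stated above.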


### <a name="MaxUnpool"></a><a name="maxunpool">**MaxUnpool**</a>

  MaxUnpool essentially computes the partial inverse of the MaxPool op.
   The input information to this op is typically the output information from a MaxPool op. The first
   input tensor X is the tensor that needs to be unpooled, which is typically the pooled tensor (first output)
   from MaxPool. The second input tensor, I, contains the indices to the (locally maximal) elements corresponding
   to the elements in the first input tensor X. Input tensor I is typically the second output of the MaxPool op.
   The third (optional) input is a tensor that specifies the output size of the unpooling operation.

  MaxUnpool is intended to compute a 'partial' inverse of the MaxPool op. 'Partial' because all the non-maximal
   values from the original input to MaxPool are set to zero in the output of the MaxUnpool op. Pooling
   the result of an unpooling operation should give back the original input to the unpooling op.

  MaxPool can produce the same output size for several input sizes, which makes the unpooling op ambiguous.
   The third input argument, output_shape, is meant to disambiguate the op and produce an output tensor of
   known/predictable size.

  In addition to the inputs, MaxUnpool takes three attributes, namely kernel_shape, strides, and pads,
   which define the exact unpooling op. The attributes typically have the same values as the corresponding
   pooling op that the unpooling op is trying to invert.

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#MaxUnpool-9">9</a>

#### Attributes

<dl>
<dt><tt>kernel_shape</tt> : list of ints (required)</dt>
<dd>The size of the kernel along each axis.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each spatial axis; it can take any value greater than or equal to 0. The values represent the number of pixels added to the beginning and end of the corresponding axis. The `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>X</tt> (differentiable) : T1</dt>
<dd>Input data tensor that has to be unpooled. This tensor is typically the first output of the MaxPool op. Dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non-image case, the dimensions are in the form of (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
<dt><tt>I</tt> (non-differentiable) : T2</dt>
<dd>Input data tensor containing the indices corresponding to elements in the first input tensor X. This tensor is typically the second output of the MaxPool op. Dimensions must be the same as input tensor X. The indices are linear, i.e. computed considering the tensor as a flattened 1-D tensor, assuming row-major storage. Also, the linear indices should not consider padding. So the values in indices are in the range [0, N x C x D1 x ... x Dn).</dd>
<dt><tt>output_shape</tt> (optional, non-differentiable) : T2</dt>
<dd>The shape of the output can be explicitly set which will cause pads values to be auto generated. If 'output_shape' is specified, 'pads' values are ignored.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T1</dt>
<dd>Output data tensor that contains the result of the unpooling.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(int64)</dt>
<dd>Constrain index tensor to int64</dd>
</dl>


#### Examples

<details>
<summary>with_output_shape</summary>

```python
node = onnx.helper.make_node(
    "MaxUnpool",
    inputs=["xT", "xI", "output_shape"],
    outputs=["y"],
    kernel_shape=[2, 2],
    strides=[2, 2],
)
xT = np.array([[[[5, 6], [7, 8]]]], dtype=np.float32)
xI = np.array([[[[5, 7], [13, 15]]]], dtype=np.int64)
output_shape = np.array((1, 1, 5, 5), dtype=np.int64)
y = np.array(
    [
        [
            [
                [0, 0, 0, 0, 0],
                [0, 5, 0, 6, 0],
                [0, 0, 0, 0, 0],
                [0, 7, 0, 8, 0],
                [0, 0, 0, 0, 0],
            ]
        ]
    ],
    dtype=np.float32,
)
expect(
    node,
    inputs=[xT, xI, output_shape],
    outputs=[y],
    name="test_maxunpool_export_with_output_shape",
)
```

</details>


<details>
<summary>without_output_shape</summary>

```python
node = onnx.helper.make_node(
    "MaxUnpool",
    inputs=["xT", "xI"],
    outputs=["y"],
    kernel_shape=[2, 2],
    strides=[2, 2],
)
xT = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
xI = np.array([[[[5, 7], [13, 15]]]], dtype=np.int64)
y = np.array(
    [[[[0, 0, 0, 0], [0, 1, 0, 2], [0, 0, 0, 0], [0, 3, 0, 4]]]],
    dtype=np.float32,
)
expect(
    node,
    inputs=[xT, xI],
    outputs=[y],
    name="test_maxunpool_export_without_output_shape",
)
```

</details>
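
As a rough illustration of the scatter step described above, the sketch below rebuilds the `without_output_shape` result with plain NumPy. The helper name `max_unpool_sketch` is hypothetical, and the default output size formula `(in - 1) * stride + kernel - pad_begin - pad_end` is an assumption. Handling of the optional `output_shape` input is deliberately left out.

```python
import numpy as np


def max_unpool_sketch(x, indices, kernel_shape, strides, pads=None):
    """Hypothetical helper: scatter pooled values back to their argmax slots."""
    spatial = x.shape[2:]
    n = len(spatial)
    pads = [0] * (2 * n) if pads is None else pads
    # Assumed default output size per spatial axis.
    out_spatial = tuple(
        (spatial[i] - 1) * strides[i] + kernel_shape[i] - pads[i] - pads[i + n]
        for i in range(n)
    )
    out = np.zeros(x.shape[:2] + out_spatial, dtype=x.dtype)
    # Indices are linear positions in the flattened (row-major) output tensor;
    # every position that is not an argmax keeps its zero.
    np.put(out, indices.ravel(), x.ravel())
    return out


xT = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
xI = np.array([[[[5, 7], [13, 15]]]], dtype=np.int64)
y = max_unpool_sketch(xT, xI, kernel_shape=[2, 2], strides=[2, 2])
print(y[0, 0].tolist())
# [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 2.0], [0.0, 0.0, 0.0, 0.0], [0.0, 3.0, 0.0, 4.0]]
```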


### <a name="Mean"></a><a name="mean">**Mean**</a>

  Element-wise mean of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Mean-1">1</a>, <a href="Changelog.md#Mean-6">6</a>, <a href="Changelog.md#Mean-8">8</a>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic, differentiable) : T</dt>
<dd>List of tensors for mean.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>mean</tt> (differentiable) : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>mean</summary>

```python
data_0 = np.array([3, 0, 2]).astype(np.float32)
data_1 = np.array([1, 3, 4]).astype(np.float32)
data_2 = np.array([2, 6, 6]).astype(np.float32)
result = np.array([2, 3, 4]).astype(np.float32)
node = onnx.helper.make_node(
    "Mean",
    inputs=["data_0", "data_1", "data_2"],
    outputs=["result"],
)
expect(
    node,
    inputs=[data_0, data_1, data_2],
    outputs=[result],
    name="test_mean_example",
)

node = onnx.helper.make_node(
    "Mean",
    inputs=["data_0"],
    outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_mean_one_input")

result = np.divide(np.add(data_0, data_1), 2.0)
node = onnx.helper.make_node(
    "Mean",
    inputs=["data_0", "data_1"],
    outputs=["result"],
)
expect(
    node, inputs=[data_0, data_1], outputs=[result], name="test_mean_two_inputs"
)
```

</details>
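
The example above uses inputs of identical shape. As a sketch of the broadcasting behaviour mentioned in the description, the following NumPy snippet averages a matrix, a vector, and a scalar. The variable names are illustrative only, and the snippet models the expected arithmetic rather than calling an ONNX runtime.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)  # shape (2, 3)
b = np.array([3, 2, 1], dtype=np.float32)               # shape (3,)
c = np.float32(2)                                       # scalar

# Broadcast every input to the common shape (2, 3), then average element-wise.
broadcast = np.broadcast_arrays(a, b, c)
mean = np.mean(np.stack(broadcast), axis=0)

print(mean.tolist())  # [[2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
```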


### <a name="MeanVarianceNormalization"></a><a name="meanvariancenormalization">**MeanVarianceNormalization**</a>

  A MeanVarianceNormalization Function: Perform mean variance normalization
        on the input tensor X using the formula: `(X - E[X]) / sqrt(E[(X - E[X])^2])`

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#MeanVarianceNormalization-9">9</a>

#### Attributes

<dl>
<dt><tt>axes</tt> : list of ints (default is ['0', '2', '3'])</dt>
<dd>A list of integers, along which to reduce. The default is to calculate along axes [0,2,3] for calculating mean and variance along each channel. Two variables with the same C-coordinate are associated with the same mean and variance.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>meanvariancenormalization</summary>

```python
node = onnx.helper.make_node(
    "MeanVarianceNormalization", inputs=["X"], outputs=["Y"]
)

input_data = np.array(
    [
        [
            [[0.8439683], [0.5665144], [0.05836735]],
            [[0.02916367], [0.12964272], [0.5060197]],
            [[0.79538304], [0.9411346], [0.9546573]],
        ],
        [
            [[0.17730942], [0.46192095], [0.26480448]],
            [[0.6746842], [0.01665257], [0.62473077]],
            [[0.9240844], [0.9722341], [0.11965699]],
        ],
        [
            [[0.41356155], [0.9129373], [0.59330076]],
            [[0.81929934], [0.7862604], [0.11799799]],
            [[0.69248444], [0.54119414], [0.07513223]],
        ],
    ],
    dtype=np.float32,
)

# Calculate expected output data
data_mean = np.mean(input_data, axis=(0, 2, 3), keepdims=1)
data_mean_squared = np.power(data_mean, 2)
data_squared = np.power(input_data, 2)
data_squared_mean = np.mean(data_squared, axis=(0, 2, 3), keepdims=1)
std = np.sqrt(data_squared_mean - data_mean_squared)
expected_output = (input_data - data_mean) / (std + 1e-9)

expect(node, inputs=[input_data], outputs=[expected_output], name="test_mvn")
```

</details>
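
The description writes the variance as the mean of squared deviations, while the example above computes it as the mean of squares minus the squared mean. The sketch below checks numerically that the two forms agree and that the normalized output has roughly zero mean and unit variance per channel. The random input and the use of float64 are assumptions made to keep the comparison tight.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((2, 3, 4, 5))  # N x C x H x W, float64
axes = (0, 2, 3)              # the operator's default axes

mean = x.mean(axis=axes, keepdims=True)
# Variance as written in the description: mean of squared deviations.
var_direct = ((x - mean) ** 2).mean(axis=axes, keepdims=True)
# Variance as computed in the example above: mean of squares minus squared mean.
var_shortcut = (x**2).mean(axis=axes, keepdims=True) - mean**2

y = (x - mean) / np.sqrt(var_direct)

print(np.allclose(var_direct, var_shortcut))  # True
```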


### <a name="MelWeightMatrix"></a><a name="melweightmatrix">**MelWeightMatrix**</a>

  Generate a MelWeightMatrix that can be used to re-weight a Tensor containing linearly sampled frequency spectra (from DFT or STFT) into num_mel_bins bands of frequency information based on the [lower_edge_hertz, upper_edge_hertz] range on the mel scale.
  This function defines the mel scale in terms of a frequency in hertz according to the following formula:

      mel(f) = 2595 * log10(1 + f/700)

  In the returned matrix, all the triangles (filterbanks) have a peak value of 1.0.

  The returned MelWeightMatrix can be used to right-multiply a spectrogram S of shape [frames, num_spectrogram_bins] of linear scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram" M of shape [frames, num_mel_bins].

#### Version

This version of the operator has been available since version 17 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>output_datatype</tt> : int (default is 1)</dt>
<dd>The data type of the output tensor. Strictly must be one of the values from DataType enum in TensorProto whose values correspond to T3. The default value is 1 = FLOAT. </dd>
</dl>

#### Inputs

<dl>
<dt><tt>num_mel_bins</tt> (non-differentiable) : T1</dt>
<dd>The number of bands in the mel spectrum.</dd>
<dt><tt>dft_length</tt> (non-differentiable) : T1</dt>
<dd>The size of the original DFT. The size of the original DFT is used to infer the size of the onesided DFT, which is understood to be floor(dft_length/2) + 1, i.e. the spectrogram only contains the nonredundant DFT bins.</dd>
<dt><tt>sample_rate</tt> (non-differentiable) : T1</dt>
<dd>Samples per second of the input signal used to create the spectrogram. Used to figure out the frequencies corresponding to each spectrogram bin, which dictates how they are mapped into the mel scale.</dd>
<dt><tt>lower_edge_hertz</tt> (non-differentiable) : T2</dt>
<dd>Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.</dd>
<dt><tt>upper_edge_hertz</tt> (non-differentiable) : T2</dt>
<dd>The desired top edge of the highest frequency band.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (non-differentiable) : T3</dt>
<dd>The Mel Weight Matrix. The output has the shape: [floor(dft_length/2) + 1][num_mel_bins].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain to integer tensors.</dd>
<dt><tt>T2</tt> : tensor(float), tensor(float16), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain to float tensors</dd>
<dt><tt>T3</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain to any numerical types.</dd>
</dl>


#### Examples

<details>
<summary>melweightmatrix</summary>

```python
node = onnx.helper.make_node(
    "MelWeightMatrix",
    inputs=[
        "num_mel_bins",
        "dft_length",
        "sample_rate",
        "lower_edge_hertz",
        "upper_edge_hertz",
    ],
    outputs=["output"],
)

num_mel_bins = np.int32(8)
dft_length = np.int32(16)
sample_rate = np.int32(8192)
lower_edge_hertz = np.float32(0)
upper_edge_hertz = np.float32(8192 / 2)

num_spectrogram_bins = dft_length // 2 + 1
frequency_bins = np.arange(0, num_mel_bins + 2)

low_frequency_mel = 2595 * np.log10(1 + lower_edge_hertz / 700)
high_frequency_mel = 2595 * np.log10(1 + upper_edge_hertz / 700)
mel_step = (high_frequency_mel - low_frequency_mel) / frequency_bins.shape[0]

frequency_bins = frequency_bins * mel_step + low_frequency_mel
frequency_bins = 700 * (np.power(10, (frequency_bins / 2595)) - 1)
frequency_bins = ((dft_length + 1) * frequency_bins) // sample_rate
frequency_bins = frequency_bins.astype(int)

output = np.zeros((num_spectrogram_bins, num_mel_bins))
output.flags.writeable = True

for i in range(num_mel_bins):
    lower_frequency_value = frequency_bins[i]  # left
    center_frequency_point = frequency_bins[i + 1]  # center
    higher_frequency_point = frequency_bins[i + 2]  # right
    low_to_center = center_frequency_point - lower_frequency_value
    if low_to_center == 0:
        output[center_frequency_point, i] = 1
    else:
        for j in range(lower_frequency_value, center_frequency_point + 1):
            output[j, i] = float(j - lower_frequency_value) / float(
                low_to_center
            )
    center_to_high = higher_frequency_point - center_frequency_point
    if center_to_high > 0:
        for j in range(center_frequency_point, higher_frequency_point):
            output[j, i] = float(higher_frequency_point - j) / float(
                center_to_high
            )

# Expected output
# 1.000000, 1.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 1.000000, 1.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 1.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 1.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
# 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000,
output = output.astype(np.float32)
expect(
    node,
    inputs=[
        num_mel_bins,
        dft_length,
        sample_rate,
        lower_edge_hertz,
        upper_edge_hertz,
    ],
    outputs=[output],
    name="test_melweightmatrix",
)
```

</details>
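
To make the mel formula and the right-multiplication from the description concrete, here is a small sketch. The helper names `hz_to_mel` and `mel_to_hz` are hypothetical, the inverse formula is derived algebraically from the one given above, and `W` is a zero-filled placeholder standing in for the operator's output.

```python
import numpy as np


def hz_to_mel(f):
    # mel(f) = 2595 * log10(1 + f / 700), as given in the description.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    # Algebraic inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


frames, dft_length, num_mel_bins = 4, 16, 8
num_spectrogram_bins = dft_length // 2 + 1  # one-sided DFT size

# Placeholder spectrogram S and placeholder weight matrix W (not real data).
S = np.ones((frames, num_spectrogram_bins), dtype=np.float32)
W = np.zeros((num_spectrogram_bins, num_mel_bins), dtype=np.float32)

# Right-multiplying S by W maps spectrogram bins to mel bins.
M = S @ W
print(M.shape)  # (4, 8)
```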


### <a name="Min"></a><a name="min">**Min**</a>

  Element-wise min of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Min-1">1</a>, <a href="Changelog.md#Min-6">6</a>, <a href="Changelog.md#Min-8">8</a>, <a href="Changelog.md#Min-12">12</a>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic, differentiable) : T</dt>
<dd>List of tensors for min.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>min</tt> (differentiable) : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>min</summary>

```python
data_0 = np.array([3, 2, 1]).astype(np.float32)
data_1 = np.array([1, 4, 4]).astype(np.float32)
data_2 = np.array([2, 5, 0]).astype(np.float32)
result = np.array([1, 2, 0]).astype(np.float32)
node = onnx.helper.make_node(
    "Min",
    inputs=["data_0", "data_1", "data_2"],
    outputs=["result"],
)
expect(
    node,
    inputs=[data_0, data_1, data_2],
    outputs=[result],
    name="test_min_example",
)

node = onnx.helper.make_node(
    "Min",
    inputs=["data_0"],
    outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_min_one_input")

result = np.minimum(data_0, data_1)
node = onnx.helper.make_node(
    "Min",
    inputs=["data_0", "data_1"],
    outputs=["result"],
)
expect(
    node, inputs=[data_0, data_1], outputs=[result], name="test_min_two_inputs"
)
```

</details>


<details>
<summary>min_all_numeric_types</summary>

```python
for op_dtype in all_numeric_dtypes:
    data_0 = np.array([3, 2, 1]).astype(op_dtype)
    data_1 = np.array([1, 4, 4]).astype(op_dtype)
    result = np.array([1, 2, 1]).astype(op_dtype)
    node = onnx.helper.make_node(
        "Min",
        inputs=["data_0", "data_1"],
        outputs=["result"],
    )
    expect(
        node,
        inputs=[data_0, data_1],
        outputs=[result],
        name=f"test_min_{np.dtype(op_dtype).name}",
    )
```

</details>
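
The examples above use inputs of identical shape. The following sketch models the variadic, broadcasting form of the operator by folding `np.minimum` over the inputs. It illustrates the expected arithmetic in NumPy and is not taken from the ONNX test suite.

```python
from functools import reduce

import numpy as np

a = np.array([[3, 2, 1], [0, 5, 6]], dtype=np.float32)  # shape (2, 3)
b = np.array([1, 4, 4], dtype=np.float32)               # shape (3,)
c = np.float32(2)                                       # scalar

# np.minimum broadcasts pairwise, so folding it handles any number of inputs.
result = reduce(np.minimum, [a, b, c])

print(result.tolist())  # [[1.0, 2.0, 1.0], [0.0, 2.0, 2.0]]
```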


### <a name="Mish"></a><a name="mish">**Mish**</a>

  Mish: A Self Regularized Non-Monotonic Neural Activation Function.

  Applies the Mish activation element-wise to the input tensor X using the formula:

  ```
  mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))
  ```

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input X and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>mish</summary>

```python
node = onnx.helper.make_node("Mish", inputs=["X"], outputs=["Y"])

input_data = np.linspace(-10, 10, 10000, dtype=np.float32)

# Calculate expected output data
expected_output = input_data * np.tanh(np.log1p(np.exp(input_data)))

expect(node, inputs=[input_data], outputs=[expected_output], name="test_mish")
```

</details>
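
The example above evaluates `softplus` as `log1p(exp(x))`, which is fine on the range [-10, 10] but overflows in `exp` for large inputs. One possible alternative, shown here as a sketch with a hypothetical helper name, uses `np.logaddexp(0, x)`, which computes the same `ln(1 + e^x)` without forming `e^x` directly.

```python
import numpy as np


def mish_stable(x):
    # softplus(x) = ln(1 + e^x) = logaddexp(0, x); avoids overflow in e^x.
    return x * np.tanh(np.logaddexp(0.0, x))


x = np.array([-1000.0, -1.0, 0.0, 1.0, 1000.0])
y = mish_stable(x)

# On a moderate range the stable form agrees with the direct formula.
x_mid = np.linspace(-10, 10, 101)
direct = x_mid * np.tanh(np.log1p(np.exp(x_mid)))
print(np.allclose(mish_stable(x_mid), direct))  # True
```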


### <a name="Mod"></a><a name="mod">**Mod**</a>

  Performs element-wise binary modulus (with Numpy-style broadcasting support).
    The sign of the remainder is the same as that of the Divisor.

    Mod operator can also behave like C fmod() or numpy.fmod. In this case, however, the sign of the remainder will be the same as that of the Dividend
    (in contrast to integer mod). To force behavior like numpy.fmod(), an 'fmod' attribute is provided.
    This attribute is set to 0 by default, causing the behavior to be like integer mod.
    Setting this attribute to 1 causes the remainder to be calculated similarly to numpy.fmod().

    If the input type is floating point, then `fmod` attribute must be set to 1.

    If the divisor is zero, the results will be platform dependent.

    This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Mod-10">10</a>

#### Attributes

<dl>
<dt><tt>fmod</tt> : int (default is 0)</dt>
<dd>Whether the operator should behave like fmod (default=0 meaning it will do integer mods); Set this to 1 to force fmod treatment</dd>
</dl>

#### Inputs

<dl>
<dt><tt>A</tt> (differentiable) : T</dt>
<dd>Dividend tensor</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Divisor tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (differentiable) : T</dt>
<dd>Remainder tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>mod_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.arange(0, 30).reshape([3, 2, 5]).astype(np.int32)
y = np.array([7]).astype(np.int32)
z = np.mod(x, y)
#   array([[[0, 1, 2, 3, 4],
#     [5, 6, 0, 1, 2]],

#    [[3, 4, 5, 6, 0],
#     [1, 2, 3, 4, 5]],

#    [[6, 0, 1, 2, 3],
#     [4, 5, 6, 0, 1]]], dtype=int32)
expect(node, inputs=[x, y], outputs=[z], name="test_mod_broadcast")
```

</details>


<details>
<summary>mod_int64_fmod</summary>

```python
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)

x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int64)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int64)
z = np.fmod(x, y)  # expected output [ 0,  1,  5,  0, -1,  3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_int64_fmod")
```

</details>


<details>
<summary>mod_mixed_sign_float16</summary>

```python
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)

x = np.array([-4.3, 7.2, 5.0, 4.3, -7.2, 8.0]).astype(np.float16)
y = np.array([2.1, -3.4, 8.0, -2.1, 3.4, 5.0]).astype(np.float16)
z = np.fmod(
    x, y
)  # expected output [-0.10156, 0.3984 , 5. , 0.10156, -0.3984 ,  3.]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_float16")
```

</details>


<details>
<summary>mod_mixed_sign_float32</summary>

```python
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)

x = np.array([-4.3, 7.2, 5.0, 4.3, -7.2, 8.0]).astype(np.float32)
y = np.array([2.1, -3.4, 8.0, -2.1, 3.4, 5.0]).astype(np.float32)
z = np.fmod(
    x, y
)  # expected output [-0.10000038, 0.39999962, 5. , 0.10000038, -0.39999962, 3.]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_float32")
```

</details>


<details>
<summary>mod_mixed_sign_float64</summary>

```python
node = onnx.helper.make_node("Mod", inputs=["x", "y"], outputs=["z"], fmod=1)

x = np.array([-4.3, 7.2, 5.0, 4.3, -7.2, 8.0]).astype(np.float64)
y = np.array([2.1, -3.4, 8.0, -2.1, 3.4, 5.0]).astype(np.float64)
z = np.fmod(x, y)  # expected output [-0.1,  0.4,  5. ,  0.1, -0.4,  3.]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_float64")
```

</details>


<details>
<summary>mod_mixed_sign_int16</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int16)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int16)
z = np.mod(x, y)  # expected output [ 0, -2,  5,  0,  2,  3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int16")
```

</details>


<details>
<summary>mod_mixed_sign_int32</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int32)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int32)
z = np.mod(x, y)  # expected output [ 0, -2,  5,  0,  2,  3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int32")
```

</details>


<details>
<summary>mod_mixed_sign_int64</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int64)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int64)
z = np.mod(x, y)  # expected output [ 0, -2,  5,  0,  2,  3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int64")
```

</details>


<details>
<summary>mod_mixed_sign_int8</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([-4, 7, 5, 4, -7, 8]).astype(np.int8)
y = np.array([2, -3, 8, -2, 3, 5]).astype(np.int8)
z = np.mod(x, y)  # expected output [ 0, -2,  5,  0,  2,  3]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_mixed_sign_int8")
```

</details>


<details>
<summary>mod_uint16</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([4, 7, 5]).astype(np.uint16)
y = np.array([2, 3, 8]).astype(np.uint16)
z = np.mod(x, y)  # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint16")
```

</details>


<details>
<summary>mod_uint32</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([4, 7, 5]).astype(np.uint32)
y = np.array([2, 3, 8]).astype(np.uint32)
z = np.mod(x, y)  # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint32")
```

</details>


<details>
<summary>mod_uint64</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([4, 7, 5]).astype(np.uint64)
y = np.array([2, 3, 8]).astype(np.uint64)
z = np.mod(x, y)  # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint64")
```

</details>


<details>
<summary>mod_uint8</summary>

```python
node = onnx.helper.make_node(
    "Mod",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([4, 7, 5]).astype(np.uint8)
y = np.array([2, 3, 8]).astype(np.uint8)
z = np.mod(x, y)  # expected output [0, 1, 5]
expect(node, inputs=[x, y], outputs=[z], name="test_mod_uint8")
```

</details>
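
The Mod examples above compute the reference output with `np.mod` for the default node and with `np.fmod` when `fmod=1`. As a small NumPy-only illustration (not part of the ONNX test suite; the variable names are chosen here for the sketch), the two functions differ only when the operands have mixed signs: `np.mod` takes the sign of the divisor, while `np.fmod` takes the sign of the dividend.

```python
import numpy as np

x = np.array([-4, 7, 5, 4, -7, 8], dtype=np.int64)
y = np.array([2, -3, 8, -2, 3, 5], dtype=np.int64)

# Result has the sign of the divisor (Python's % behaviour).
divisor_sign = np.mod(x, y)  # [ 0, -2,  5,  0,  2,  3]

# Result has the sign of the dividend (C's fmod behaviour).
dividend_sign = np.fmod(x, y)  # [ 0,  1,  5,  0, -1,  3]

print(divisor_sign, dividend_sign)
```

For operands of the same sign, such as the unsigned cases above, both functions agree.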


### <a name="Mul"></a><a name="mul">**Mul**</a>

  Performs element-wise binary multiplication (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

  (Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Mul-1">1</a>, <a href="Changelog.md#Mul-6">6</a>, <a href="Changelog.md#Mul-7">7</a>, <a href="Changelog.md#Mul-13">13</a>

#### Inputs

<dl>
<dt><tt>A</tt> (differentiable) : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> (differentiable) : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (differentiable) : T</dt>
<dd>Result; has the same element type as the two inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>mul</summary>

```python
node = onnx.helper.make_node(
    "Mul",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.float32)
z = x * y  # expected output [4., 10., 18.]
expect(node, inputs=[x, y], outputs=[z], name="test_mul_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul")

x = np.random.randint(4, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(24, size=(3, 4, 5), dtype=np.uint8)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_uint8")
```

</details>


<details>
<summary>mul_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Mul",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = x * y
expect(node, inputs=[x, y], outputs=[z], name="test_mul_bcast")
```

</details>
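
The `mul_broadcast` example above broadcasts only the second operand. As a hedged NumPy sketch of the multidirectional case mentioned in the description, both operands can be stretched at once; the shapes and names below are chosen purely for illustration.

```python
import numpy as np

a = np.arange(3, dtype=np.float32).reshape(3, 1)  # shape (3, 1): [[0], [1], [2]]
b = np.arange(1, 5, dtype=np.float32).reshape(1, 4)  # shape (1, 4): [[1, 2, 3, 4]]

# Each size-1 axis is stretched to match the other operand,
# so the product has shape (3, 4).
c = a * b
print(c.shape)  # (3, 4)
print(c)
```

Here row `i` of `c` equals `i * [1, 2, 3, 4]`.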


### <a name="Multinomial"></a><a name="multinomial">**Multinomial**</a>

  Generate a tensor of samples from a multinomial distribution according to the probabilities
  of each of the possible outcomes.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int (default is 6)</dt>
<dd>(Optional) The data type for the elements of the output tensor. If not specified, int32 is used.</dd>
<dt><tt>sample_size</tt> : int (default is 1)</dt>
<dd>Number of times to sample.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator. If not specified, one is generated automatically.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor with shape [batch_size, class_size], where class_size is the number of all possible outcomes. Each value along the axis zero represents the unnormalized log-probability of each corresponding outcome in a batch.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor with shape [batch_size, sample_size], where sample_size is the number of times to sample. Each value along the axis zero represents the outcome of the corresponding sample in a batch.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain output types to integral tensors.</dd>
</dl>
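
This file gives no example for Multinomial. The following is a minimal NumPy sketch of the behaviour described above, not a reference implementation: the function name `multinomial_sketch`, the use of a softmax to normalize the log-probabilities, and the integer `seed` passed to `np.random.default_rng` are all assumptions made for illustration (the operator's `seed` attribute is a float, and a runtime is free to use a different random generator).

```python
import numpy as np


def multinomial_sketch(logits, sample_size=1, seed=None, dtype=np.int32):
    """Draw class indices from unnormalized log-probabilities (illustrative only)."""
    logits = np.asarray(logits, dtype=np.float64)  # shape [batch_size, class_size]

    # Softmax over the class axis turns log-probabilities into probabilities.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)

    rng = np.random.default_rng(seed)  # assumption: integer seed or None
    batch_size, class_size = probs.shape
    output = np.empty((batch_size, sample_size), dtype=dtype)
    for b in range(batch_size):
        # Independent draws (with replacement) for each batch entry.
        output[b] = rng.choice(class_size, size=sample_size, p=probs[b])
    return output


logits = np.log(np.array([[0.1, 0.2, 0.7], [0.5, 0.25, 0.25]]))
samples = multinomial_sketch(logits, sample_size=4, seed=0)
print(samples.shape)  # (2, 4)
```

The output has shape [batch_size, sample_size], and every entry is a class index in [0, class_size).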


### <a name="Neg"></a><a name="neg">**Neg**</a>

  Neg takes one input tensor (Tensor<T>) and produces one output tensor
  (Tensor<T>) in which the sign of each element is flipped; y = -x is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Neg-1">1</a>, <a href="Changelog.md#Neg-6">6</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float), tensor(int32), tensor(int8), tensor(int16), tensor(int64), tensor(float16), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to signed numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>neg</summary>

```python
node = onnx.helper.make_node(
    "Neg",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-4, 2]).astype(np.float32)
y = np.negative(x)  # expected output [4., -2.]
expect(node, inputs=[x], outputs=[y], name="test_neg_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.negative(x)
expect(node, inputs=[x], outputs=[y], name="test_neg")
```

</details>


### <a name="NegativeLogLikelihoodLoss"></a><a name="negativeloglikelihoodloss">**NegativeLogLikelihoodLoss**</a>

  A NegativeLogLikelihoodLoss operator computes (weighted) negative log likelihood loss.
  Its "input" tensor has the shape of (N, C, d1, d2, ..., dk) where k >= 0.
  The "input" tensor contains log-probabilities for input[n, :, d_1, d_2,..., d_k] being in a class of [0, C).
  The operator's "target" input tensor has the shape of (N, d1, d2, ..., dk). It encodes class labels (one of C classes)
  or it may contain a special value (indicated by an attribute ignore_index) for N x d1 x d2 x ... x dk samples.
  The loss value for input[n, :, d_1, d_2,...d_k] being classified as class c = target[n][d_1][d_2]...[d_k] is computed as:

  ```
  loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k].
  ```

  When an optional "weight" is provided, the sample loss is calculated as:

  ```
  loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k] * weight[c].
  ```

  The loss is zero when the target value equals ignore_index:

  ```
  loss[n][d_1][d_2]...[d_k] = 0, when target[n][d_1][d_2]...[d_k] = ignore_index
  ```

  If "reduction" attribute is set to "none", the operator's output will be the above loss with shape (N, d1, d2, ..., dk).
  If "reduction" attribute is set to "mean" (the default attribute value), the output loss is (weight) averaged:

  ```
  mean(loss), if "weight" is not provided,
  ```

  or if weight is provided,

  ```
  sum(loss) / sum(weight[target[n][d_1][d_2]...[d_k]]), for all samples.
  ```

  If "reduction" attribute is set to "sum", the output is a scalar: `sum(loss)`.

  See also https://pytorch.org/docs/stable/nn.html#torch.nn.NLLLoss.

  Example 1:

  ```
  import numpy as np

  # negative log likelihood loss, "none" reduction
  N, C, d1 = 2, 3, 2
  input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
           [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
  target = [[2, 1], [0, 2]]

  loss = np.zeros((N, d1))
  for n in range(N):
      for d_1 in range(d1):
          c = target[n][d_1]
          loss[n][d_1] = -input[n][c][d_1]

  # print(loss)
  # [[-3. -2.]
  #  [-0. -2.]]
  ```

  Example 2:

  ```
  import numpy as np

  # weighted negative log likelihood loss, sum reduction
  N, C, d1 = 2, 3, 2
  input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
           [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
  target = [[2, 1], [0, 2]]
  weight = [0.2, 0.3, 0.1]
  loss = np.zeros((N, d1))
  for n in range(N):
      for d_1 in range(d1):
          c = target[n][d_1]
          loss[n][d_1] = -input[n][c][d_1] * weight[c]

  loss = np.sum(loss)
  # print(loss)
  # approximately -1.1
  ```

  Example 3:

  ```
  import numpy as np

  # weighted negative log likelihood loss, mean reduction
  N, C, d1 = 2, 3, 2
  input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
           [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
  target = [[2, 1], [0, 2]]
  weight = [0.2, 0.3, 0.1]
  loss = np.zeros((N, d1))
  weight_total = 0
  for n in range(N):
      for d_1 in range(d1):
          c = target[n][d_1]
          loss[n][d_1] = -input[n][c][d_1] * weight[c]
          weight_total = weight_total + weight[c]

  loss = np.sum(loss) / weight_total
  # print(loss)
  # approximately -1.57
  ```

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#NegativeLogLikelihoodLoss-12">12</a>

#### Attributes

<dl>
<dt><tt>ignore_index</tt> : int</dt>
<dd>Specifies a target value that is ignored and does not contribute to the input gradient. It's an optional value.</dd>
<dt><tt>reduction</tt> : string (default is mean)</dt>
<dd>Type of reduction to apply to loss: none, sum, mean (default). 'none': the output is the loss for each sample. 'sum': the output will be summed. 'mean': the sum of the output will be divided by the sum of applied weights.</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor of shape (N, C) or (N, C, d1, d2, ..., dk).</dd>
<dt><tt>target</tt> (non-differentiable) : Tind</dt>
<dd>Target tensor of shape (N) or (N, d1, d2, ..., dk). Target element value shall be in range of [0, C). If ignore_index is specified, it may have a value outside [0, C) and the target values should either be in the range [0, C) or have the value ignore_index.</dd>
<dt><tt>weight</tt> (optional, non-differentiable) : T</dt>
<dd>Optional rescaling weight tensor. If given, it has to be a tensor of size C. Otherwise, it is treated as if having all ones.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>loss</tt> (differentiable) : T</dt>
<dd>The negative log likelihood loss</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input, weight, and output types to floating-point tensors.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain target to integer types</dd>
</dl>


#### Examples
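
The examples below call `compute_negative_log_likelihood_loss`, a reference helper that is not reproduced in this file. The following is a minimal NumPy sketch of the formulas in the operator description, not that helper itself. The name `nll_loss_sketch`, the vectorized gather, and the choice to treat ignored positions as having weight zero (also when no `weight` is given) are assumptions made for illustration.

```python
import numpy as np


def nll_loss_sketch(input, target, weight=None, reduction="mean", ignore_index=None):
    """Illustrative NLL loss for input (N, C, d1, ..., dk) and target (N, d1, ..., dk)."""
    input = np.asarray(input)
    target = np.asarray(target)
    num_classes = input.shape[1]

    # A missing weight is treated as all ones, as stated for the "weight" input.
    if weight is None:
        weight = np.ones(num_classes, dtype=input.dtype)
    else:
        weight = np.asarray(weight, dtype=input.dtype)

    # Positions whose label equals ignore_index do not contribute.
    if ignore_index is None:
        keep = np.ones(target.shape, dtype=bool)
    else:
        keep = target != ignore_index
    safe_target = np.where(keep, target, 0)  # keep the gather indices in range

    # Gather input[n, c, d1, ..., dk] with c = target[n, d1, ..., dk].
    class_last = np.moveaxis(input, 1, -1)  # (N, d1, ..., dk, C)
    picked = np.take_along_axis(class_last, safe_target[..., None], axis=-1)[..., 0]

    applied_weight = weight[safe_target] * keep
    loss = -picked * applied_weight

    if reduction == "none":
        return loss
    if reduction == "sum":
        return loss.sum()
    if reduction == "mean":
        return loss.sum() / applied_weight.sum()
    raise ValueError(f"unsupported reduction: {reduction!r}")


x = np.array([[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
              [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]])
t = np.array([[2, 1], [0, 2]])
print(nll_loss_sketch(x, t, reduction="none"))
```

With the data from Examples 1 to 3 in the description, this sketch reproduces the per-sample losses, the weighted sum, and the weighted mean shown there.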

<details>
<summary>input_shape_is_NC</summary>

```python
reduction = "none"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
)

N, C = 3, 5
np.random.seed(0)
input = np.random.rand(N, C).astype(np.float32)
target = np.random.randint(0, high=C, size=(N,)).astype(np.int64)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=None, reduction=reduction
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NC",
)
```

</details>


<details>
<summary>input_shape_is_NCd1</summary>

```python
reduction = "mean"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=None, reduction=reduction
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1",
)
```

</details>


<details>
<summary>input_shape_is_NCd1_ii</summary>

```python
reduction = "mean"
ignore_index = np.int64(1)
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)
target[0][0] = np.int64(1)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=None, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1_mean_weight_negative_ii</summary>

```python
reduction = "mean"
ignore_index = np.int64(-1)

node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1 = 3, 5, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1)).astype(np.int64)
target[0][0] = -1
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1_mean_weight_negative_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1_weight</summary>

```python
reduction = "mean"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1_weight",
)
```

</details>


<details>
<summary>input_shape_is_NCd1_weight_ii</summary>

```python
reduction = "mean"
ignore_index = np.int64(1)
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, d1 = 3, 5, 2
np.random.seed(0)
input = np.random.rand(N, C, d1).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, d1)).astype(np.int64)
target[0][0] = np.int64(1)
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1_weight_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2</summary>

```python
reduction = "none"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=None, reduction=reduction
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2_no_weight_reduction_mean_ii</summary>

```python
reduction = "mean"
ignore_index = np.int64(1)
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
target[0][0][0] = np.int64(1)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2_no_weight_reduction_mean_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2_reduction_mean</summary>

```python
reduction = "mean"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=None, reduction=reduction
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2_reduction_mean",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2_reduction_sum</summary>

```python
reduction = "sum"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=None, reduction=reduction
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2_reduction_sum",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2_with_weight</summary>

```python
reduction = "none"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2_with_weight",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2_with_weight_reduction_mean</summary>

```python
reduction = "mean"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2_with_weight_reduction_mean",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2_with_weight_reduction_sum</summary>

```python
reduction = "sum"
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2_with_weight_reduction_sum",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2_with_weight_reduction_sum_ii</summary>

```python
reduction = "sum"
ignore_index = np.int64(0)
node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1, dim2 = 3, 5, 6, 6
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2)).astype(np.int64)
target[0][0][0] = np.int64(0)
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2_with_weight_reduction_sum_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3_none_no_weight_negative_ii</summary>

```python
reduction = "none"
ignore_index = np.int64(-5)

node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1, dim2, dim3 = 3, 5, 6, 6, 5
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2, dim3).astype(np.float32)
target = np.random.randint(0, high=C, size=(N, dim1, dim2, dim3)).astype(
    np.int64
)
target[0][0][0][0] = -5

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2d3_none_no_weight_negative_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3_sum_weight_high_ii</summary>

```python
reduction = "sum"
ignore_index = np.int64(10)

node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C = 3, 5
np.random.seed(0)
input = np.random.rand(N, C).astype(np.float32)
target = np.random.randint(0, high=C, size=(N)).astype(np.int64)
target[0] = 10
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2d3_sum_weight_high_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3d4d5_mean_weight</summary>

```python
reduction = "mean"

node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target", "weight"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
target = np.random.randint(
    0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, weight=weight, reduction=reduction
)

expect(
    node,
    inputs=[input, target, weight],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2d3d4d5_mean_weight",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3d4d5_none_no_weight</summary>

```python
reduction = "none"

node = onnx.helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction=reduction,
)

N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
input = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
target = np.random.randint(
    0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)

negative_log_likelihood_loss = compute_negative_log_likelihood_loss(
    input, target, reduction=reduction
)

expect(
    node,
    inputs=[input, target],
    outputs=[negative_log_likelihood_loss],
    name="test_nllloss_NCd1d2d3d4d5_none_no_weight",
)
```

</details>


### <a name="NonMaxSuppression"></a><a name="nonmaxsuppression">**NonMaxSuppression**</a>

  Filter out boxes that have high intersection-over-union (IOU) overlap with previously selected boxes.
  Bounding boxes with score less than score_threshold are removed. Bounding box format is indicated by attribute center_point_box.
  Note that this algorithm is agnostic to where the origin is in the coordinate system and more generally is invariant to
  orthogonal transformations and translations of the coordinate system; thus translations or reflections of the coordinate system
  result in the same boxes being selected by the algorithm.
  The selected_indices output is a set of integers indexing into the input collection of bounding boxes representing the selected boxes.
  The bounding box coordinates corresponding to the selected indices can then be obtained using the Gather or GatherND operation.

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#NonMaxSuppression-10">10</a>

#### Attributes

<dl>
<dt><tt>center_point_box</tt> : int (default is 0)</dt>
<dd>Integer indicating the format of the box data. The default is 0. 0 - the box data is supplied as [y1, x1, y2, x2] where (y1, x1) and (y2, x2) are the coordinates of any diagonal pair of box corners and the coordinates can be provided as normalized (i.e., lying in the interval [0, 1]) or absolute. Mostly used for TF models. 1 - the box data is supplied as [x_center, y_center, width, height]. Mostly used for Pytorch models.</dd>
</dl>

#### Inputs (2 - 5)

<dl>
<dt><tt>boxes</tt> : tensor(float)</dt>
<dd>An input tensor with shape [num_batches, spatial_dimension, 4]. The single box data format is indicated by center_point_box.</dd>
<dt><tt>scores</tt> : tensor(float)</dt>
<dd>An input tensor with shape [num_batches, num_classes, spatial_dimension]</dd>
<dt><tt>max_output_boxes_per_class</tt> (optional) : tensor(int64)</dt>
<dd>Integer representing the maximum number of boxes to be selected per batch per class. It is a scalar. Defaults to 0, which means no output.</dd>
<dt><tt>iou_threshold</tt> (optional) : tensor(float)</dt>
<dd>Float representing the threshold for deciding whether boxes overlap too much with respect to IOU. It is a scalar. Value range [0, 1]. Defaults to 0.</dd>
<dt><tt>score_threshold</tt> (optional) : tensor(float)</dt>
<dd>Float representing the threshold for deciding when to remove boxes based on score. It is a scalar.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>selected_indices</tt> : tensor(int64)</dt>
<dd>Selected indices from the boxes tensor, with shape [num_selected_indices, 3]. Each selected index has the format [batch_index, class_index, box_index].</dd>
</dl>
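
The description notes that the coordinates of the selected boxes can be recovered with Gather or GatherND. As a hedged NumPy analogue of that step (the data and variable names are made up for illustration), the batch and box columns of `selected_indices` can be used to index the `boxes` tensor directly:

```python
import numpy as np

# boxes has shape [num_batches, spatial_dimension, 4].
boxes = np.array(
    [[[0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 1.0, 1.1], [0.0, 10.0, 1.0, 11.0]]],
    dtype=np.float32,
)
# Each row is [batch_index, class_index, box_index].
selected_indices = np.array([[0, 0, 2], [0, 0, 0]], dtype=np.int64)

batch_index = selected_indices[:, 0]
box_index = selected_indices[:, 2]
selected_boxes = boxes[batch_index, box_index]  # shape [num_selected_indices, 4]
print(selected_boxes)
```

The class column is not needed for this lookup because the boxes are shared across classes.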

#### Type Constraints



#### Examples
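
The examples below list expected `selected_indices` without showing how they arise. The following is a minimal sketch of greedy non-maximum suppression for a single batch entry and a single class, assuming the default corner format (`center_point_box=0`). The function names `iou` and `nms_one_class` are hypothetical. Suppressing a box only when its IOU is strictly greater than `iou_threshold`, and dropping a box only when its score is strictly less than `score_threshold`, are assumptions about tie handling rather than statements of the specification.

```python
import numpy as np


def iou(box_a, box_b):
    """IOU of two boxes given as [y1, x1, y2, x2], any diagonal pair of corners."""
    ya1, ya2 = sorted((box_a[0], box_a[2]))
    xa1, xa2 = sorted((box_a[1], box_a[3]))
    yb1, yb2 = sorted((box_b[0], box_b[2]))
    xb1, xb2 = sorted((box_b[1], box_b[3]))
    inter_h = max(0.0, min(ya2, yb2) - max(ya1, yb1))
    inter_w = max(0.0, min(xa2, xb2) - max(xa1, xb1))
    inter = inter_h * inter_w
    union = (ya2 - ya1) * (xa2 - xa1) + (yb2 - yb1) * (xb2 - xb1) - inter
    return inter / union if union > 0 else 0.0


def nms_one_class(boxes, scores, max_output_boxes, iou_threshold, score_threshold):
    """Greedy NMS for one batch entry and one class; returns selected box indices."""
    order = np.argsort(-np.asarray(scores), kind="stable")  # highest score first
    selected = []
    for idx in order:
        if len(selected) >= max_output_boxes:
            break
        if scores[idx] < score_threshold:
            continue
        # Keep the box only if it does not overlap too much with any kept box.
        if all(iou(boxes[idx], boxes[kept]) <= iou_threshold for kept in selected):
            selected.append(int(idx))
    return selected


boxes = np.array(
    [
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 0.1, 1.0, 1.1],
        [0.0, 0.9, 1.0, -0.1],
        [0.0, 10.0, 1.0, 11.0],
        [1.0, 10.1, 0.0, 11.1],
        [1.0, 101.0, 0.0, 100.0],
    ],
    dtype=np.float32,
)
scores = np.array([0.9, 0.75, 0.6, 0.95, 0.5, 0.3], dtype=np.float32)
picked = nms_one_class(boxes, scores, 3, 0.5, 0.0)
print(picked)  # [3, 0, 5]
```

Because corners are sorted before the overlap is computed, flipped boxes such as `[1.0, 1.0, 0.0, 0.0]` are handled the same way as `[0.0, 0.0, 1.0, 1.0]`.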

<details>
<summary>nonmaxsuppression_center_point_box_format</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
    center_point_box=1,
)
boxes = np.array(
    [
        [
            [0.5, 0.5, 1.0, 1.0],
            [0.5, 0.6, 1.0, 1.0],
            [0.5, 0.4, 1.0, 1.0],
            [0.5, 10.5, 1.0, 1.0],
            [0.5, 10.6, 1.0, 1.0],
            [0.5, 100.5, 1.0, 1.0],
        ]
    ]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0], [0, 0, 5]]).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_center_point_box_format",
)
```

</details>


<details>
<summary>nonmaxsuppression_flipped_coordinates</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array(
    [
        [
            [1.0, 1.0, 0.0, 0.0],
            [0.0, 0.1, 1.0, 1.1],
            [0.0, 0.9, 1.0, -0.1],
            [0.0, 10.0, 1.0, 11.0],
            [1.0, 10.1, 0.0, 11.1],
            [1.0, 101.0, 0.0, 100.0],
        ]
    ]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0], [0, 0, 5]]).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_flipped_coordinates",
)
```

</details>


<details>
<summary>nonmaxsuppression_identical_boxes</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array(
    [
        [
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
        ]
    ]
).astype(np.float32)
scores = np.array(
    [[[0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]]]
).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 0]]).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_identical_boxes",
)
```

</details>


<details>
<summary>nonmaxsuppression_limit_output_size</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array(
    [
        [
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.1, 1.0, 1.1],
            [0.0, -0.1, 1.0, 0.9],
            [0.0, 10.0, 1.0, 11.0],
            [0.0, 10.1, 1.0, 11.1],
            [0.0, 100.0, 1.0, 101.0],
        ]
    ]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([2]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0]]).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_limit_output_size",
)
```

</details>


<details>
<summary>nonmaxsuppression_single_box</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array([[[0.0, 0.0, 1.0, 1.0]]]).astype(np.float32)
scores = np.array([[[0.9]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 0]]).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_single_box",
)
```

</details>


<details>
<summary>nonmaxsuppression_suppress_by_IOU</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array(
    [
        [
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.1, 1.0, 1.1],
            [0.0, -0.1, 1.0, 0.9],
            [0.0, 10.0, 1.0, 11.0],
            [0.0, 10.1, 1.0, 11.1],
            [0.0, 100.0, 1.0, 101.0],
        ]
    ]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0], [0, 0, 5]]).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_suppress_by_IOU",
)
```

</details>


<details>
<summary>nonmaxsuppression_suppress_by_IOU_and_scores</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array(
    [
        [
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.1, 1.0, 1.1],
            [0.0, -0.1, 1.0, 0.9],
            [0.0, 10.0, 1.0, 11.0],
            [0.0, 10.1, 1.0, 11.1],
            [0.0, 100.0, 1.0, 101.0],
        ]
    ]
).astype(np.float32)
scores = np.array([[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]).astype(np.float32)
max_output_boxes_per_class = np.array([3]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.4]).astype(np.float32)
selected_indices = np.array([[0, 0, 3], [0, 0, 0]]).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_suppress_by_IOU_and_scores",
)
```

</details>


<details>
<summary>nonmaxsuppression_two_batches</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array(
    [
        [
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.1, 1.0, 1.1],
            [0.0, -0.1, 1.0, 0.9],
            [0.0, 10.0, 1.0, 11.0],
            [0.0, 10.1, 1.0, 11.1],
            [0.0, 100.0, 1.0, 101.0],
        ],
        [
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.1, 1.0, 1.1],
            [0.0, -0.1, 1.0, 0.9],
            [0.0, 10.0, 1.0, 11.0],
            [0.0, 10.1, 1.0, 11.1],
            [0.0, 100.0, 1.0, 101.0],
        ],
    ]
).astype(np.float32)
scores = np.array(
    [[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]], [[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]
).astype(np.float32)
max_output_boxes_per_class = np.array([2]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array(
    [[0, 0, 3], [0, 0, 0], [1, 0, 3], [1, 0, 0]]
).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_two_batches",
)
```

</details>


<details>
<summary>nonmaxsuppression_two_classes</summary>

```python
node = onnx.helper.make_node(
    "NonMaxSuppression",
    inputs=[
        "boxes",
        "scores",
        "max_output_boxes_per_class",
        "iou_threshold",
        "score_threshold",
    ],
    outputs=["selected_indices"],
)
boxes = np.array(
    [
        [
            [0.0, 0.0, 1.0, 1.0],
            [0.0, 0.1, 1.0, 1.1],
            [0.0, -0.1, 1.0, 0.9],
            [0.0, 10.0, 1.0, 11.0],
            [0.0, 10.1, 1.0, 11.1],
            [0.0, 100.0, 1.0, 101.0],
        ]
    ]
).astype(np.float32)
scores = np.array(
    [[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3], [0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]
).astype(np.float32)
max_output_boxes_per_class = np.array([2]).astype(np.int64)
iou_threshold = np.array([0.5]).astype(np.float32)
score_threshold = np.array([0.0]).astype(np.float32)
selected_indices = np.array(
    [[0, 0, 3], [0, 0, 0], [0, 1, 3], [0, 1, 0]]
).astype(np.int64)

expect(
    node,
    inputs=[
        boxes,
        scores,
        max_output_boxes_per_class,
        iou_threshold,
        score_threshold,
    ],
    outputs=[selected_indices],
    name="test_nonmaxsuppression_two_classes",
)
```

</details>


### <a name="NonZero"></a><a name="nonzero">**NonZero**</a>

  Returns the indices of the elements that are non-zero
      (in row-major order - by dimension).
      NonZero behaves similarly to numpy.nonzero:
      https://docs.scipy.org/doc/numpy/reference/generated/numpy.nonzero.html,
      but for scalar input, NonZero produces output shape (0, N) instead of (1, N), which is different from Numpy's behavior.
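
As a rough illustration of the output layout (a NumPy sketch, not the ONNX implementation), stacking the tuple returned by `numpy.nonzero` gives one row per input dimension and one column per non-zero element. The variable names below are only for illustration.

```python
import numpy as np

x = np.array([[1, 0], [1, 1]], dtype=bool)

# np.nonzero returns one index array per dimension; stacking them gives
# the (rank, N) int64 layout described above.
y = np.array(np.nonzero(x), dtype=np.int64)
print(y.shape)  # (2, 3): rank 2, three non-zero elements

# Reading the result column by column gives the coordinates of each
# non-zero element in row-major order.
coords = y.T
print(coords.tolist())  # [[0, 0], [1, 0], [1, 1]]
```

The scalar-input case is deliberately left out of the sketch, since that is where NonZero and NumPy differ.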

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#NonZero-9">9</a>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>input</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : tensor(int64)</dt>
<dd>output</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>nonzero</summary>

```python
node = onnx.helper.make_node(
    "NonZero",
    inputs=["condition"],
    outputs=["result"],
)

condition = np.array([[1, 0], [1, 1]], dtype=bool)
result = np.array(
    np.nonzero(condition), dtype=np.int64
)  # expected output [[0, 1, 1], [0, 0, 1]]
expect(node, inputs=[condition], outputs=[result], name="test_nonzero_example")
```

</details>


### <a name="Not"></a><a name="not">**Not**</a>

  Returns the negation of the input tensor element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrain input/output to boolean tensors.</dd>
</dl>


#### Examples

<details>
<summary>not</summary>

```python
node = onnx.helper.make_node(
    "Not",
    inputs=["x"],
    outputs=["not"],
)

# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
expect(node, inputs=[x], outputs=[np.logical_not(x)], name="test_not_2d")

# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
expect(node, inputs=[x], outputs=[np.logical_not(x)], name="test_not_3d")

# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
expect(node, inputs=[x], outputs=[np.logical_not(x)], name="test_not_4d")
```

</details>


### <a name="OneHot"></a><a name="onehot">**OneHot**</a>

  Produces a one-hot tensor based on inputs.
      The locations represented by the index values in the 'indices' input tensor will have 'on_value'
      and the other locations will have 'off_value' in the output tensor, where 'on_value' and 'off_value'
      are specified as part of required input argument 'values', which is a two-element tensor of format
      [off_value, on_value]. The rank of the output tensor will be one greater than the rank of the
      input tensor. The additional dimension is for one-hot representation. The additional dimension will
      be inserted at the position specified by 'axis'. If 'axis' is not specified then the additional
      dimension will be inserted as the innermost dimension, i.e. axis=-1. The size of the additional
      dimension is specified by required scalar input 'depth'. The type of the output tensor is the same
      as the type of the 'values' input. Any entries in the 'indices' input tensor with values outside
      the range [-depth, depth-1] will result in one-hot representation with all 'off_value' values in the
      output tensor.

      when axis = 0:
      output[input[i, j, k], i, j, k] = 1 for all i, j, k and 0 otherwise.

      when axis = -1:
      output[i, j, k, input[i, j, k]] = 1 for all i, j, k and 0 otherwise.
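
The rules above (negative indices, out-of-range indices, and axis placement) can be sketched in NumPy as follows. This is an illustrative model only: `one_hot_sketch` is a hypothetical helper name, not the `one_hot` function used by the test cases in this file, and it assumes `values` is a two-element array in `[off_value, on_value]` order.

```python
import numpy as np


def one_hot_sketch(indices, depth, values, axis=-1):
    """Illustrative NumPy model of OneHot; not the ONNX reference code."""
    # Non-integer indices and depth are cast to int64 before use.
    indices = np.asarray(indices).astype(np.int64)
    depth = int(np.asarray(depth).astype(np.int64).reshape(-1)[0])
    values = np.asarray(values)
    off_value, on_value = values[0], values[1]

    # Negative indices in [-depth, -1] count from the end of the one-hot axis.
    wrapped = np.where(indices < 0, indices + depth, indices)

    # Compare against 0..depth-1 on a new trailing axis. Indices outside
    # [-depth, depth-1] match nothing, so their slice is all off_value.
    hits = wrapped[..., np.newaxis] == np.arange(depth)
    out = np.where(hits, on_value, off_value).astype(values.dtype)

    # Move the one-hot axis from the last position to the requested one.
    return np.moveaxis(out, -1, axis)


values = np.array([1, 3], dtype=np.float32)  # [off_value, on_value]
y = one_hot_sketch(np.array([0, -7, -8]), np.float32(10), values, axis=1)
print(y.shape)  # (3, 10)
```

Building the one-hot dimension last and then moving it keeps the index comparison independent of `axis`, which is one simple way to cover the whole accepted range [-r-1, r].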


#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#OneHot-9">9</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>(Optional) Axis along which the one-hot representation is added. Default: axis=-1. axis=-1 means that the additional dimension will be inserted as the innermost/last dimension in the output tensor. Negative value means counting dimensions from the back. Accepted range is [-r-1, r] where r = rank(indices).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>indices</tt> (non-differentiable) : T1</dt>
<dd>Input tensor containing indices. Any entries in the 'indices' input tensor with values outside the range [-depth, depth-1] will result in one-hot representation with all 'off_value' values in the output tensor. In case 'indices' is of non-integer type, the values will be cast to int64 before use.</dd>
<dt><tt>depth</tt> (non-differentiable) : T2</dt>
<dd>Scalar or Rank 1 tensor containing exactly one element, specifying the number of classes in one-hot tensor. This is also the size of the one-hot dimension (specified by 'axis' attribute) added on in the output tensor. The values in the 'indices' input tensor are expected to be in the range [-depth, depth-1]. In case 'depth' is of non-integer type, it will be cast to int64 before use.</dd>
<dt><tt>values</tt> (non-differentiable) : T3</dt>
<dd>Rank 1 tensor containing exactly two elements, in the format [off_value, on_value], where 'on_value' is the value used for filling locations specified in 'indices' input tensor, and 'off_value' is the value used for filling locations other than those specified in 'indices' input tensor. </dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (non-differentiable) : T3</dt>
<dd>Tensor of rank one greater than input tensor 'indices', i.e. rank(output) = rank(indices) + 1. The data type of the output tensor's elements is the same as the type of the input 'values'.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input to only numeric types.</dd>
<dt><tt>T2</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input to only numeric types.</dd>
<dt><tt>T3</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain to any tensor type.</dd>
</dl>


#### Examples

<details>
<summary>with_axis</summary>

```python
axisValue = 1
on_value = 3
off_value = 1
output_type = np.float32
node = onnx.helper.make_node(
    "OneHot",
    inputs=["indices", "depth", "values"],
    outputs=["y"],
    axis=axisValue,
)
indices = np.array([[1, 9], [2, 4]], dtype=np.float32)
depth = np.float32(10)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, axis=axisValue, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
    node,
    inputs=[indices, depth, values],
    outputs=[y],
    name="test_onehot_with_axis",
)
```

</details>


<details>
<summary>with_negative_axis</summary>

```python
axisValue = -2
on_value = 3
off_value = 1
output_type = np.float32
node = onnx.helper.make_node(
    "OneHot",
    inputs=["indices", "depth", "values"],
    outputs=["y"],
    axis=axisValue,
)
indices = np.array([[1, 9], [2, 4]], dtype=np.float32)
depth = np.float32(10)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, axis=axisValue, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
    node,
    inputs=[indices, depth, values],
    outputs=[y],
    name="test_onehot_with_negative_axis",
)
```

</details>


<details>
<summary>with_negative_indices</summary>

```python
axisValue = 1
on_value = 3
off_value = 1
output_type = np.float32
node = onnx.helper.make_node(
    "OneHot",
    inputs=["indices", "depth", "values"],
    outputs=["y"],
    axis=axisValue,
)
indices = np.array([0, -7, -8], dtype=np.int64)

# print(y)
# [[3. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
#  [1. 1. 1. 3. 1. 1. 1. 1. 1. 1.]
#  [1. 1. 3. 1. 1. 1. 1. 1. 1. 1.]]

depth = np.float32(10)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, axis=axisValue, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
    node,
    inputs=[indices, depth, values],
    outputs=[y],
    name="test_onehot_negative_indices",
)
```

</details>


<details>
<summary>without_axis</summary>

```python
on_value = 5
off_value = 2
output_type = np.int32
node = onnx.helper.make_node(
    "OneHot", inputs=["indices", "depth", "values"], outputs=["y"]
)
indices = np.array([0, 7, 8], dtype=np.int64)
depth = np.float32(12)
values = np.array([off_value, on_value], dtype=output_type)
y = one_hot(indices, depth, dtype=output_type)
y = y * (on_value - off_value) + off_value
expect(
    node,
    inputs=[indices, depth, values],
    outputs=[y],
    name="test_onehot_without_axis",
)
```

</details>


### <a name="Optional"></a><a name="optional">**Optional**</a>

  Constructs an optional-type value containing either an empty optional of a certain type specified by the attribute,
  or a non-empty value containing the input element.

#### Version

This version of the operator has been available since version 15 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>type</tt> : type_proto</dt>
<dd>Type of the element in the optional output</dd>
</dl>

#### Inputs (0 - 1)

<dl>
<dt><tt>input</tt> (optional) : V</dt>
<dd>The input element.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : O</dt>
<dd>The optional output enclosing the input element.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain input type to all tensor and sequence types.</dd>
<dt><tt>O</tt> : optional(seq(tensor(uint8))), optional(seq(tensor(uint16))), optional(seq(tensor(uint32))), optional(seq(tensor(uint64))), optional(seq(tensor(int8))), optional(seq(tensor(int16))), optional(seq(tensor(int32))), optional(seq(tensor(int64))), optional(seq(tensor(float16))), optional(seq(tensor(float))), optional(seq(tensor(double))), optional(seq(tensor(string))), optional(seq(tensor(bool))), optional(seq(tensor(complex64))), optional(seq(tensor(complex128))), optional(tensor(uint8)), optional(tensor(uint16)), optional(tensor(uint32)), optional(tensor(uint64)), optional(tensor(int8)), optional(tensor(int16)), optional(tensor(int32)), optional(tensor(int64)), optional(tensor(float16)), optional(tensor(float)), optional(tensor(double)), optional(tensor(string)), optional(tensor(bool)), optional(tensor(complex64)), optional(tensor(complex128))</dt>
<dd>Constrain output type to all optional tensor or optional sequence types.</dd>
</dl>


### <a name="OptionalGetElement"></a><a name="optionalgetelement">**OptionalGetElement**</a>

  If the input is a tensor or sequence type, it returns the input.
  If the input is an optional type, it outputs the element in the input.
  It is an error if the input is an empty optional-type (i.e. does not have an element) and the behavior is undefined in this case.
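
The behavior described above can be modeled in plain Python, using `None` to stand for an empty optional. This is only a sketch under that assumption: `get_element_sketch` is a hypothetical name, and raising an exception for the empty case is one possible choice, since the specification leaves that case undefined.

```python
def get_element_sketch(value):
    """Illustrative model of OptionalGetElement; None means empty optional."""
    if value is None:
        # The spec calls this an error with undefined behavior; raising
        # here is an assumption made for the sketch.
        raise ValueError("input is an empty optional")
    # Tensors, sequences, and non-empty optionals all yield their content.
    return value


seq = [1, 2, 3]
print(get_element_sketch(seq) is seq)  # True: the element is returned as-is
```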

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#OptionalGetElement-15">15</a>

#### Inputs

<dl>
<dt><tt>input</tt> : O</dt>
<dd>The optional input.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : V</dt>
<dd>Output element in the optional input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>O</tt> : optional(seq(tensor(uint8))), optional(seq(tensor(uint16))), optional(seq(tensor(uint32))), optional(seq(tensor(uint64))), optional(seq(tensor(int8))), optional(seq(tensor(int16))), optional(seq(tensor(int32))), optional(seq(tensor(int64))), optional(seq(tensor(float16))), optional(seq(tensor(float))), optional(seq(tensor(double))), optional(seq(tensor(string))), optional(seq(tensor(bool))), optional(seq(tensor(complex64))), optional(seq(tensor(complex128))), optional(tensor(uint8)), optional(tensor(uint16)), optional(tensor(uint32)), optional(tensor(uint64)), optional(tensor(int8)), optional(tensor(int16)), optional(tensor(int32)), optional(tensor(int64)), optional(tensor(float16)), optional(tensor(float)), optional(tensor(double)), optional(tensor(string)), optional(tensor(bool)), optional(tensor(complex64)), optional(tensor(complex128)), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain input type to optional tensor and optional sequence types, as well as plain tensor and sequence types.</dd>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain output type to all tensor or sequence types.</dd>
</dl>


### <a name="OptionalHasElement"></a><a name="optionalhaselement">**OptionalHasElement**</a>

  Returns true if (1) the input is an optional-type and contains an element,
  or, (2) the input is a tensor or sequence type.
  If the input is not provided or is an empty optional-type, this op returns false.
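
A minimal Python model of these rules is shown below. It assumes `None` represents both a missing input and an empty optional, and `has_element_sketch` is a hypothetical name chosen for this sketch rather than the helper used by the test cases.

```python
import numpy as np


def has_element_sketch(value=None):
    """Illustrative model of OptionalHasElement; None means empty/missing."""
    # The result is a scalar (0-d) boolean tensor.
    return np.array(value is not None)


print(has_element_sketch())                      # False: no input provided
print(has_element_sketch(np.array([1, 2, 3])))   # True: tensor input
print(has_element_sketch([np.array([1, 2])]))    # True: sequence input
```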

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#OptionalHasElement-15">15</a>

#### Inputs (0 - 1)

<dl>
<dt><tt>input</tt> (optional) : O</dt>
<dd>The optional input.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : B</dt>
<dd>A scalar boolean tensor. If true, the input is a tensor or sequence, or an optional-type input that contains an element. If false, the input was not provided or is an empty optional.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>O</tt> : optional(seq(tensor(uint8))), optional(seq(tensor(uint16))), optional(seq(tensor(uint32))), optional(seq(tensor(uint64))), optional(seq(tensor(int8))), optional(seq(tensor(int16))), optional(seq(tensor(int32))), optional(seq(tensor(int64))), optional(seq(tensor(float16))), optional(seq(tensor(float))), optional(seq(tensor(double))), optional(seq(tensor(string))), optional(seq(tensor(bool))), optional(seq(tensor(complex64))), optional(seq(tensor(complex128))), optional(tensor(uint8)), optional(tensor(uint16)), optional(tensor(uint32)), optional(tensor(uint64)), optional(tensor(int8)), optional(tensor(int16)), optional(tensor(int32)), optional(tensor(int64)), optional(tensor(float16)), optional(tensor(float)), optional(tensor(double)), optional(tensor(string)), optional(tensor(bool)), optional(tensor(complex64)), optional(tensor(complex128)), tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain input type to optional tensor and optional sequence types, as well as plain tensor and sequence types.</dd>
<dt><tt>B</tt> : tensor(bool)</dt>
<dd>Constrain output to a boolean tensor.</dd>
</dl>


#### Examples

<details>
<summary>empty</summary>

```python
optional = None

tensor_type_proto = onnx.helper.make_tensor_type_proto(
    elem_type=onnx.TensorProto.INT32, shape=[]
)
optional_type_proto = onnx.helper.make_optional_type_proto(tensor_type_proto)

# OptionalHasElement takes a tensor or optional as input
for input_type_proto in [tensor_type_proto, optional_type_proto]:
    input_name_options = {
        "empty": "optional_input",
        "empty_no_input_name": "",
        "empty_no_input": None,
    }
    for test_name_suffix, input_name in input_name_options.items():
        if input_type_proto == tensor_type_proto and input_name:
            # the input tensor cannot be empty if input name is provided.
            continue
        node = onnx.helper.make_node(
            "OptionalHasElement",
            inputs=[] if input_name is None else [input_name],
            outputs=["output"],
        )
        output = optional_has_element_reference_implementation(optional)
        test_name = (
            "test_optional_has_element_"
            + test_name_suffix
            + (
                "_optional_input"
                if input_type_proto == optional_type_proto
                else "_tensor_input"
            )
        )
        expect(
            node,
            inputs=[optional] if input_name else [],
            outputs=[output],
            input_type_protos=[input_type_proto] if input_name else [],
            name=test_name,
        )
```

</details>


<details>
<summary>get_element_sequence</summary>

```python
optional = [np.array([1, 2, 3, 4]).astype(np.int32)]
tensor_type_proto = onnx.helper.make_tensor_type_proto(
    elem_type=onnx.TensorProto.INT32,
    shape=[
        4,
    ],
)
seq_type_proto = onnx.helper.make_sequence_type_proto(tensor_type_proto)
optional_type_proto = onnx.helper.make_optional_type_proto(seq_type_proto)

node = onnx.helper.make_node(
    "OptionalGetElement", inputs=["optional_input"], outputs=["output"]
)
output = optional_get_element_reference_implementation(optional)
expect(
    node,
    inputs=[optional],
    outputs=[output],
    input_type_protos=[optional_type_proto],
    name="test_optional_get_element_optional_sequence",
)
expect(
    node,
    inputs=[optional],
    outputs=[output],
    input_type_protos=[seq_type_proto],
    name="test_optional_get_element_sequence",
)
```

</details>


<details>
<summary>get_element_tensor</summary>

```python
optional = np.array([1, 2, 3, 4]).astype(np.float32)
tensor_type_proto = onnx.helper.make_tensor_type_proto(
    elem_type=onnx.TensorProto.FLOAT,
    shape=[
        4,
    ],
)
optional_type_proto = onnx.helper.make_optional_type_proto(tensor_type_proto)

node = onnx.helper.make_node(
    "OptionalGetElement", inputs=["optional_input"], outputs=["output"]
)
output = optional_get_element_reference_implementation(optional)
expect(
    node,
    inputs=[optional],
    outputs=[output],
    input_type_protos=[optional_type_proto],
    name="test_optional_get_element_optional_tensor",
)
expect(
    node,
    inputs=[optional],
    outputs=[output],
    input_type_protos=[tensor_type_proto],
    name="test_optional_get_element_tensor",
)
```

</details>


<details>
<summary>optionalhaselement</summary>

```python
optional = np.array([1, 2, 3, 4]).astype(np.float32)
tensor_type_proto = onnx.helper.make_tensor_type_proto(
    elem_type=onnx.TensorProto.FLOAT,
    shape=[
        4,
    ],
)
optional_type_proto = onnx.helper.make_optional_type_proto(tensor_type_proto)

# OptionalHasElement takes a tensor or optional as input
for input_type_protos in [tensor_type_proto, optional_type_proto]:
    node = onnx.helper.make_node(
        "OptionalHasElement", inputs=["optional_input"], outputs=["output"]
    )
    output = optional_has_element_reference_implementation(optional)
    test_name = "test_optional_has_element_" + (
        "optional_input"
        if input_type_protos == optional_type_proto
        else "tensor_input"
    )
    expect(
        node,
        inputs=[optional],
        outputs=[output],
        input_type_protos=[input_type_protos],
        name=test_name,
    )
```

</details>


### <a name="Or"></a><a name="or">**Or**</a>

  Returns the tensor resulting from performing the `or` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
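
As a small NumPy illustration of multidirectional broadcasting (a sketch of the semantics, not the operator implementation), both operands below are expanded to a common shape before the elementwise `or` is applied. The arrays are arbitrary example values.

```python
import numpy as np

a = np.array([[True], [False], [False]])  # shape (3, 1)
b = np.array([False, True])               # shape (2,)

# a is expanded along its second axis and b along a new first axis,
# so the result has shape (3, 2).
c = np.logical_or(a, b)
print(c.shape)     # (3, 2)
print(c.tolist())  # [[True, True], [False, True], [False, True]]
```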

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Or-1">1</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrain input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>


#### Examples

<details>
<summary>or</summary>

```python
node = onnx.helper.make_node(
    "Or",
    inputs=["x", "y"],
    outputs=["or"],
)

# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
y = (np.random.randn(3, 4) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or2d")

# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(3, 4, 5) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or3d")

# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or4d")
```

</details>


<details>
<summary>or_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Or",
    inputs=["x", "y"],
    outputs=["or"],
)

# 3d vs 1d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(5) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast3v1d")

# 3d vs 2d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(4, 5) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast3v2d")

# 4d vs 2d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast4v2d")

# 4d vs 3d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(4, 5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast4v3d")

# 4d vs 4d
x = (np.random.randn(1, 4, 1, 6) > 0).astype(bool)
y = (np.random.randn(3, 1, 5, 6) > 0).astype(bool)
z = np.logical_or(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_or_bcast4v4d")
```

</details>


### <a name="PRelu"></a><a name="prelu">**PRelu**</a>

  PRelu takes input data (`Tensor<T>`) and a slope tensor as input, and produces one
  output data (`Tensor<T>`) where the function `f(x) = slope * x for x < 0`,
  `f(x) = x for x >= 0`, is applied to the data tensor elementwise.
  This operator supports **unidirectional broadcasting** (tensor slope should be unidirectional broadcastable to input tensor X); for more details please check [the doc](Broadcasting.md).
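
The piecewise definition and the one-way broadcast of `slope` can be sketched in NumPy as below. `prelu_sketch` is a hypothetical name, and using `np.broadcast_to` is just one way to model unidirectional broadcasting: it expands `slope` to the shape of `X` and raises an error if that is not possible, while `X` itself is never expanded.

```python
import numpy as np


def prelu_sketch(x, slope):
    """Illustrative NumPy model of PRelu; not the ONNX reference code."""
    x = np.asarray(x)
    # Unidirectional broadcasting: only slope is expanded, to x's shape.
    slope_b = np.broadcast_to(np.asarray(slope), x.shape)
    # f(x) = slope * x for x < 0, f(x) = x for x >= 0.
    return np.where(x < 0, slope_b * x, x)


x = np.array([[-2.0, 1.0, -4.0], [3.0, -1.0, 0.0]], dtype=np.float32)
slope = np.array([0.5, 0.1, 0.25], dtype=np.float32)  # one slope per column
y = prelu_sketch(x, slope)
print(y.shape)  # (2, 3), same as x
```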

#### Version

This version of the operator has been available since version 16 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#PRelu-1">1</a>, <a href="Changelog.md#PRelu-6">6</a>, <a href="Changelog.md#PRelu-7">7</a>, <a href="Changelog.md#PRelu-9">9</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
<dt><tt>slope</tt> (differentiable) : T</dt>
<dd>Slope tensor. The shape of slope can be smaller than that of the first input X; if so, its shape must be unidirectional broadcastable to X.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor (same size as X)</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(uint32), tensor(uint64), tensor(int32), tensor(int64)</dt>
<dd>Constrain input and output types to float/int tensors.</dd>
</dl>


#### Examples

<details>
<summary>prelu</summary>

```python
node = onnx.helper.make_node(
    "PRelu",
    inputs=["x", "slope"],
    outputs=["y"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
slope = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * slope

expect(node, inputs=[x, slope], outputs=[y], name="test_prelu_example")
```

</details>


<details>
<summary>prelu_broadcast</summary>

```python
node = onnx.helper.make_node(
    "PRelu",
    inputs=["x", "slope"],
    outputs=["y"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
slope = np.random.randn(5).astype(np.float32)
y = np.clip(x, 0, np.inf) + np.clip(x, -np.inf, 0) * slope

expect(node, inputs=[x, slope], outputs=[y], name="test_prelu_broadcast")
```

</details>


### <a name="Pad"></a><a name="pad">**Pad**</a>

  Given a tensor containing the data to be padded (`data`), a tensor containing the number of start and end pad values for each axis (`pads`), (optionally) a `mode`, and (optionally) `constant_value`,
  a padded tensor (`output`) is generated.

  The four supported `modes` are (similar to corresponding modes supported by `numpy.pad`):

  1) `constant`(default) - pads with a given constant value as specified by `constant_value` (which defaults to 0, empty string, or False)

  2) `reflect` - pads with the reflection of the vector mirrored on the first and last values of the vector along each axis

  3) `edge` - pads with the edge values of the array

  4) `wrap` - wrap-around padding as if the data tensor forms a torus


  Example 1 (`constant` mode):

  Insert 0 pads to the beginning of the second dimension.

  ```
  data = [
      [1.0, 1.2],
      [2.3, 3.4],
      [4.5, 5.7],
  ]

  pads = [0, 2, 0, 0]

  mode = 'constant'

  constant_value = 0.0

  output = [
      [0.0, 0.0, 1.0, 1.2],
      [0.0, 0.0, 2.3, 3.4],
      [0.0, 0.0, 4.5, 5.7],
  ]
  ```

  Example 2 (`reflect` mode):

  ```
  data = [
      [1.0, 1.2],
      [2.3, 3.4],
      [4.5, 5.7],
  ]

  pads = [0, 2, 0, 0]

  mode = 'reflect'

  output = [
      [1.0, 1.2, 1.0, 1.2],
      [2.3, 3.4, 2.3, 3.4],
      [4.5, 5.7, 4.5, 5.7],
  ]
  ```

  Example 3 (`edge` mode):

  ```
  data = [
      [1.0, 1.2],
      [2.3, 3.4],
      [4.5, 5.7],
  ]

  pads = [0, 2, 0, 0]

  mode = 'edge'

  output = [
      [1.0, 1.0, 1.0, 1.2],
      [2.3, 2.3, 2.3, 3.4],
      [4.5, 4.5, 4.5, 5.7],
  ]
  ```

  Example 4 (`wrap` mode):

  ```
  data = [
      [1.0, 1.2],
      [2.3, 3.4],
      [4.5, 5.7],
  ]

  pads = [2, 1, 1, 1]

  mode = 'wrap'

  output = [
      [3.4, 2.3, 3.4, 2.3],
      [5.7, 4.5, 5.7, 4.5],
      [1.2, 1.0, 1.2, 1.0],
      [3.4, 2.3, 3.4, 2.3],
      [5.7, 4.5, 5.7, 4.5],
      [1.2, 1.0, 1.2, 1.0],
  ]
  ```
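
The `pads` layout `[x1_begin, x2_begin, ..., x1_end, x2_end, ...]` differs from the per-axis `(before, after)` pairs that `numpy.pad` expects. The sketch below shows one way to convert between the two and reproduces Examples 1, 3 and 4. It is an illustration only: `onnx_pads_to_numpy` is a hypothetical helper, it assumes non-negative pads (negative values, which remove elements, are not handled), and it assumes the ONNX mode names map directly onto the `numpy.pad` modes of the same name.

```python
import numpy as np


def onnx_pads_to_numpy(pads, rank, axes=None):
    """Hypothetical helper: convert ONNX `pads` to numpy.pad's pad_width."""
    if axes is None:
        axes = list(range(rank))  # default: all axes, in order
    num_axes = len(axes)
    assert len(pads) == 2 * num_axes
    pad_width = [(0, 0)] * rank
    for i, axis in enumerate(axes):
        # Begin values come first, end values follow num_axes entries later.
        # `axis % rank` maps negative axes into [0, rank - 1].
        pad_width[axis % rank] = (int(pads[i]), int(pads[i + num_axes]))
    return pad_width


data = np.array([[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]], dtype=np.float32)

pad_width = onnx_pads_to_numpy([0, 2, 0, 0], rank=data.ndim)
constant = np.pad(data, pad_width, mode="constant", constant_values=0.0)
edge = np.pad(data, pad_width, mode="edge")
wrap = np.pad(data, onnx_pads_to_numpy([2, 1, 1, 1], rank=data.ndim), mode="wrap")

# With explicit axes, only the listed axes receive padding.
partial = onnx_pads_to_numpy([0, 3, 0, 4], rank=4, axes=[-3, -1])
```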

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Pad-1">1</a>, <a href="Changelog.md#Pad-2">2</a>, <a href="Changelog.md#Pad-11">11</a>, <a href="Changelog.md#Pad-13">13</a>, <a href="Changelog.md#Pad-18">18</a>, <a href="Changelog.md#Pad-19">19</a>

#### Attributes

<dl>
<dt><tt>mode</tt> : string (default is constant)</dt>
<dd>Supported modes: `constant`(default), `reflect`, `edge`, `wrap`</dd>
</dl>

#### Inputs (2 - 4)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Input tensor.</dd>
<dt><tt>pads</tt> (non-differentiable) : tensor(int64)</dt>
<dd>Tensor of integers indicating the number of padding elements to add or remove (if negative) at the beginning and end of each axis. For 2D input tensor, it is the number of pixels. `pads` should be a 1D tensor of shape [2 * num_axes] where `num_axes` refers to the number of elements in the `axes` input or the input rank if `axes` are not provided explicitly. `pads` format should be: [x1_begin, x2_begin, ..., x1_end, x2_end,...], where xi_begin is the number of pad values added at the beginning of axis `axes[i]` and xi_end, the number of pad values added at the end of axis `axes[i]`.</dd>
<dt><tt>constant_value</tt> (optional, non-differentiable) : T</dt>
<dd>(Optional) A scalar value to be used if the mode chosen is `constant` (by default it is 0, empty string or False).</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : Tind</dt>
<dd>1-D tensor of axes that `pads` apply to. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(data). Behavior is undefined if an axis is repeated. If not provided, all axes are assumed (`[0, 1, ..., input_rank-1]`).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor after padding.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input and output types to all tensor types up to IRv10.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain indices to integer types</dd>
</dl>


#### Examples

<details>
<summary>constant_pad</summary>

```python
node = onnx.helper.make_node(
    "Pad", inputs=["x", "pads", "value"], outputs=["y"], mode="constant"
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
pads = np.array([0, 0, 1, 3, 0, 0, 2, 4]).astype(
    np.int64
)  # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
value = np.float32(1.2)
y = pad_impl(x, pads, "constant", 1.2)

expect(node, inputs=[x, pads, value], outputs=[y], name="test_constant_pad")
```

</details>


<details>
<summary>constant_pad_axes</summary>

```python
node = onnx.helper.make_node(
    "Pad", inputs=["x", "pads", "value", "axes"], outputs=["y"], mode="constant"
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
pads = np.array([0, 3, 0, 4]).astype(
    np.int64
)  # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
value = np.float32(1.2)
axes = np.array([1, 3], dtype=np.int64)
y = pad_impl(
    x,
    pads,
    "constant",
    1.2,
    [1, 3],
)

expect(
    node,
    inputs=[x, pads, value, axes],
    outputs=[y],
    name="test_constant_pad_axes",
)
```

</details>


<details>
<summary>constant_pad_negative_axes</summary>

```python
node = onnx.helper.make_node(
    "Pad", inputs=["x", "pads", "value", "axes"], outputs=["y"], mode="constant"
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
pads = np.array([0, 3, 0, 4]).astype(
    np.int64
)  # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
value = np.float32(1.2)
axes = np.array([-3, -1], dtype=np.int64)
y = pad_impl(
    x,
    pads,
    "constant",
    1.2,
    [-3, -1],
)

expect(
    node,
    inputs=[x, pads, value, axes],
    outputs=[y],
    name="test_constant_pad_negative_axes",
)
```

</details>


<details>
<summary>reflection_edge_and_wrap_pad</summary>

```python
for mode in ("edge", "reflect", "wrap"):
    node = onnx.helper.make_node(
        "Pad", inputs=["x", "pads"], outputs=["y"], mode=mode
    )
    x = np.random.randn(1, 3, 4, 5).astype(np.int32)
    pads = np.array([0, 0, 1, 1, 0, 0, 1, 1]).astype(
        np.int64
    )  # pad order [x1_begin, x2_begin, ..., x1_end, x2_end, ...]
    y = pad_impl(x, pads, mode)

    expect(node, inputs=[x, pads], outputs=[y], name=f"test_{mode}_pad")
```

</details>


### <a name="Pow"></a><a name="pow">**Pow**</a>

  Pow takes input data (Tensor<T>) and an exponent tensor, and
  produces one output data (Tensor<T>) where the function `f(x) = x^exponent`
  is applied to the data tensor elementwise.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
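
A short NumPy sketch of the two points above: the exponent is broadcast against the base, and the output keeps the type of `X` (type `T` in the constraints below). `pow_reference` is a hypothetical helper, and casting the NumPy result back to the base's dtype is an assumption about how to mirror that type constraint, not a statement about any particular runtime.

```python
import numpy as np


def pow_reference(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Hypothetical helper: x ** y with NumPy broadcasting, cast to x's dtype."""
    # The output type is T, the type of X, so the NumPy result is cast back.
    return np.power(x, y).astype(x.dtype)


# Multidirectional broadcasting: (2, 3) base with a (3,) exponent.
base = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
exponent = np.array([1, 2, 3], dtype=np.float32)
z_bcast = pow_reference(base, exponent)

# Mixed integer types: int32 base, int64 exponent, int32 result.
z_types = pow_reference(
    np.array([1, 2, 3], dtype=np.int32), np.array([4, 5, 6], dtype=np.int64)
)
```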

#### Version

This version of the operator has been available since version 15 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Pow-1">1</a>, <a href="Changelog.md#Pow-7">7</a>, <a href="Changelog.md#Pow-12">12</a>, <a href="Changelog.md#Pow-13">13</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>First operand, base of the exponent.</dd>
<dt><tt>Y</tt> (differentiable) : T1</dt>
<dd>Second operand, power of the exponent.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Z</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input X and output types to float/int tensors.</dd>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input Y types to float/int tensors.</dd>
</dl>


#### Examples

<details>
<summary>pow</summary>

```python
node = onnx.helper.make_node(
    "Pow",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.float32)
z = pow(x, y)  # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_example")

x = np.arange(60).reshape(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = pow(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_pow")
```

</details>


<details>
<summary>pow_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Pow",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array(2).astype(np.float32)
z = pow(x, y)  # expected output [1., 4., 9.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_bcast_scalar")

node = onnx.helper.make_node(
    "Pow",
    inputs=["x", "y"],
    outputs=["z"],
)
x = np.array([[1, 2, 3], [4, 5, 6]]).astype(np.float32)
y = np.array([1, 2, 3]).astype(np.float32)
# expected output [[1, 4, 27], [4, 25, 216]]
z = pow(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_pow_bcast_array")
```

</details>


<details>
<summary>types</summary>

```python
node = onnx.helper.make_node(
    "Pow",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.int64)
z = pow(x, y)  # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_int64")

x = np.array([1, 2, 3]).astype(np.int64)
y = np.array([4, 5, 6]).astype(np.float32)
z = pow(x, y)  # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int64_float32")

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.int32)
z = pow(x, y)  # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_int32")

x = np.array([1, 2, 3]).astype(np.int32)
y = np.array([4, 5, 6]).astype(np.float32)
z = pow(x, y)  # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int32_float32")

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.uint64)
z = pow(x, y)  # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_uint64")

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([4, 5, 6]).astype(np.uint32)
z = pow(x, y)  # expected output [1., 32., 729.]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_float32_uint32")

x = np.array([1, 2, 3]).astype(np.int64)
y = np.array([4, 5, 6]).astype(np.int64)
z = pow(x, y)  # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int64_int64")

x = np.array([1, 2, 3]).astype(np.int32)
y = np.array([4, 5, 6]).astype(np.int32)
z = pow(x, y)  # expected output [1, 32, 729]
expect(node, inputs=[x, y], outputs=[z], name="test_pow_types_int32_int32")
```

</details>


### <a name="QLinearConv"></a><a name="qlinearconv">**QLinearConv**</a>

  The convolution operator consumes a quantized input tensor, its scale and zero point,
  a quantized filter, its scale and zero point, and output's scale and zero point,
  and computes the quantized output. Each scale and zero-point pair must have the same shape.
  This means they must be either scalars (per tensor) or 1-D tensors (per output channel).
  Each input or output and its related zero point must have the same type.
  When bias is present, it must be quantized using scale = input scale * weight scale and
  a zero point of 0.
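
The bias rule can be illustrated with a small NumPy sketch. `quantize_bias` is a hypothetical helper; the text above fixes the scale and zero point but not the rounding mode, so rounding to nearest with ties to even and clamping to the int32 range are assumptions made here.

```python
import numpy as np


def quantize_bias(bias, x_scale, w_scale):
    """Hypothetical helper: quantize a float bias to int32 for QLinearConv."""
    # scale = input scale * weight scale; w_scale may be a scalar or per channel.
    bias_scale = np.float64(x_scale) * np.asarray(w_scale, dtype=np.float64)
    # Zero point is 0, so quantization is a plain scaled rounding (assumed rint).
    q = np.rint(np.asarray(bias, dtype=np.float64) / bias_scale)
    info = np.iinfo(np.int32)
    return np.clip(q, info.min, info.max).astype(np.int32)


# Two output channels with per-channel weight scales.
bias_q = quantize_bias(bias=[0.5, -0.25], x_scale=0.5, w_scale=[0.25, 0.5])
print(bias_q.tolist())  # [4, -1]
```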

#### Version

This version of the operator has been available since version 10 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>auto_pad</tt> : string (default is NOTSET)</dt>
<dd>auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default value is NOTSET, which means explicit padding is used. SAME_UPPER or SAME_LOWER mean pad the input so that `output_shape[i] = ceil(input_shape[i] / strides[i])` for each axis `i`. The padding is split between the two sides equally or almost equally (depending on whether it is even or odd). In case the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER.</dd>
<dt><tt>dilations</tt> : list of ints</dt>
<dd>dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.</dd>
<dt><tt>group</tt> : int (default is 1)</dt>
<dd>number of groups input channels and output channels are divided into. default is 1.</dd>
<dt><tt>kernel_shape</tt> : list of ints</dt>
<dd>The shape of the convolution kernel. If not present, should be inferred from input 'w'.</dd>
<dt><tt>pads</tt> : list of ints</dt>
<dd>Padding for the beginning and ending along each spatial axis; it can take any value greater than or equal to 0. The values represent the number of pixels added to the beginning and end part of the corresponding axis. `pads` format should be as follows: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 along the start and end of each spatial axis.</dd>
<dt><tt>strides</tt> : list of ints</dt>
<dd>Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.</dd>
</dl>

#### Inputs (8 - 9)

<dl>
<dt><tt>x</tt> : T1</dt>
<dd>Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 ... x Dn). Optionally, if dimension denotation is in effect, the operation expects input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].</dd>
<dt><tt>x_scale</tt> : tensor(float)</dt>
<dd>Scale tensor for input 'x'. It's a scalar, which means a per-tensor/layer quantization.</dd>
<dt><tt>x_zero_point</tt> : T1</dt>
<dd>Zero point tensor for input 'x'. It's a scalar, which means a per-tensor/layer quantization.</dd>
<dt><tt>w</tt> : T2</dt>
<dd>The weight tensor that will be used in the convolutions; has size (M x C/group x kH x kW), where C is the number of channels, and kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the kernel shape will be (M x C/group x k1 x k2 x ... x kn), where (k1 x k2 x ... kn) is the dimension of the kernel. Optionally, if dimension denotation is in effect, the operation expects the weight tensor to arrive with the dimension denotation of [FILTER_OUT_CHANNEL, FILTER_IN_CHANNEL, FILTER_SPATIAL, FILTER_SPATIAL ...]. X.shape[1] == (W.shape[1] * group) == C (assuming zero based indices for the shape array). Or in other words FILTER_IN_CHANNEL should be equal to DATA_CHANNEL. </dd>
<dt><tt>w_scale</tt> : tensor(float)</dt>
<dd>Scale tensor for input 'w'. It could be a scalar or a 1-D tensor, which means a per-tensor/layer or per output channel quantization. If it's a 1-D tensor, its number of elements should be equal to the number of output channels (M).</dd>
<dt><tt>w_zero_point</tt> : T2</dt>
<dd>Zero point tensor for input 'w'. It could be a scalar or a 1-D tensor, which means a per-tensor/layer or per output channel quantization. If it's a 1-D tensor, its number of elements should be equal to the number of output channels (M).</dd>
<dt><tt>y_scale</tt> : tensor(float)</dt>
<dd>Scale tensor for output 'y'. It's a scalar, which means a per-tensor/layer quantization.</dd>
<dt><tt>y_zero_point</tt> : T3</dt>
<dd>Zero point tensor for output 'y'. It's a scalar, which means a per-tensor/layer quantization.</dd>
<dt><tt>B</tt> (optional) : T4</dt>
<dd>Optional 1D bias to be added to the convolution, has size of M. Bias must be quantized using scale = x_scale * w_scale and zero_point = 0</dd>
</dl>

#### Outputs

<dl>
<dt><tt>y</tt> : T3</dt>
<dd>Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, and pad lengths.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(int8), tensor(uint8)</dt>
<dd>Constrain input type to 8-bit integer tensor.</dd>
<dt><tt>T2</tt> : tensor(int8), tensor(uint8)</dt>
<dd>Constrain filter type to 8-bit integer tensor.</dd>
<dt><tt>T3</tt> : tensor(int8), tensor(uint8)</dt>
<dd>Constrain output type to 8-bit integer tensor.</dd>
<dt><tt>T4</tt> : tensor(int32)</dt>
<dd>Constrain bias type to 32-bit integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>qlinearconv</summary>

```python
node = onnx.helper.make_node(
    "QLinearConv",
    inputs=[
        "x",
        "x_scale",
        "x_zero_point",
        "w",
        "w_scale",
        "w_zero_point",
        "y_scale",
        "y_zero_point",
    ],
    outputs=["y"],
)

x = np.array(
    [
        [255, 174, 162, 25, 203, 168, 58],
        [15, 59, 237, 95, 129, 0, 64],
        [56, 242, 153, 221, 168, 12, 166],
        [232, 178, 186, 195, 237, 162, 237],
        [188, 39, 124, 77, 80, 102, 43],
        [127, 230, 21, 83, 41, 40, 134],
        [255, 154, 92, 141, 42, 148, 247],
    ],
    dtype=np.uint8,
).reshape((1, 1, 7, 7))

x_scale = np.float32(0.00369204697)
x_zero_point = np.uint8(132)

w = np.array([0], dtype=np.uint8).reshape((1, 1, 1, 1))

w_scale = np.array([0.00172794575], dtype=np.float32)
w_zero_point = np.array([255], dtype=np.uint8)

y_scale = np.float32(0.00162681262)
y_zero_point = np.uint8(123)

output = np.array(
    [
        [0, 81, 93, 230, 52, 87, 197],
        [240, 196, 18, 160, 126, 255, 191],
        [199, 13, 102, 34, 87, 243, 89],
        [23, 77, 69, 60, 18, 93, 18],
        [67, 216, 131, 178, 175, 153, 212],
        [128, 25, 234, 172, 214, 215, 121],
        [0, 101, 163, 114, 213, 107, 8],
    ],
    dtype=np.uint8,
).reshape((1, 1, 7, 7))

expect(
    node,
    inputs=[
        x,
        x_scale,
        x_zero_point,
        w,
        w_scale,
        w_zero_point,
        y_scale,
        y_zero_point,
    ],
    outputs=[output],
    name="test_qlinearconv",
)
```

</details>


### <a name="QLinearMatMul"></a><a name="qlinearmatmul">**QLinearMatMul**</a>

  Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html.
  It consumes two quantized input tensors, their scales and zero points, scale and zero point of output,
  and computes the quantized output. The quantization formula is y = saturate((x / y_scale) + y_zero_point).
  The result of (x / y_scale) is rounded to the nearest value, with ties to even. Refer to https://en.wikipedia.org/wiki/Rounding for details.
  Scale and zero point must have same shape. They must be either scalar (per tensor) or N-D tensor
  (per row for 'a' and per column for 'b'). Scalar refers to per tensor quantization whereas N-D refers to per row
  or per column quantization. If the input is 2D of shape [M, K] then zero point and scale tensor may be
  an M element vector [v_1, v_2, ..., v_M] for per row quantization and a K element vector [v_1, v_2, ..., v_K]
  for per column quantization. If the input is N-D tensor with shape [D1, D2, M, K] then zero point and scale tensor may
  have shape [D1, D2, M, 1] for per row quantization and shape [D1, D2, 1, K] for per column quantization.
  The product must never overflow, and accumulation may overflow if and only if it is done in 32 bits.
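
The dequantize, multiply, requantize flow described above can be sketched in NumPy as follows. `qlinear_matmul_reference` is a hypothetical helper, not the ONNX reference implementation. It covers only the integer types (it relies on `np.iinfo`), and it computes in float64, which is an assumption chosen to sidestep the overflow question rather than a statement about how runtimes accumulate.

```python
import numpy as np


def qlinear_matmul_reference(
    a, a_scale, a_zero_point, b, b_scale, b_zero_point, y_scale, y_zero_point
):
    """Hypothetical reference: dequantize, multiply, then requantize."""
    f64 = np.float64
    # Dequantize both operands: real = (q - zero_point) * scale.
    a_real = (a.astype(f64) - np.asarray(a_zero_point, f64)) * np.asarray(a_scale, f64)
    b_real = (b.astype(f64) - np.asarray(b_zero_point, f64)) * np.asarray(b_scale, f64)
    y_real = np.matmul(a_real, b_real)
    # y = saturate(round_half_even(x / y_scale) + y_zero_point)
    y = np.rint(y_real / np.asarray(y_scale, f64)) + np.asarray(y_zero_point, f64)
    info = np.iinfo(y_zero_point.dtype)
    return np.clip(y, info.min, info.max).astype(y_zero_point.dtype)


a = np.array([[208, 236, 0, 238], [3, 214, 255, 29]], dtype=np.uint8)
b = np.array(
    [[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]], dtype=np.uint8
)
y = qlinear_matmul_reference(
    a,
    np.array([0.0066], dtype=np.float32),
    np.array([113], dtype=np.uint8),
    b,
    np.array([0.00705], dtype=np.float32),
    np.array([114], dtype=np.uint8),
    np.array([0.0107], dtype=np.float32),
    np.array([118], dtype=np.uint8),
)
print(y.tolist())  # [[168, 115, 255], [1, 66, 151]]
```

The inputs are the uint8 2D case from the examples further down, and the result matches the expected output listed there.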

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#QLinearMatMul-10">10</a>

#### Inputs

<dl>
<dt><tt>a</tt> (non-differentiable) : T1</dt>
<dd>N-dimensional quantized matrix a</dd>
<dt><tt>a_scale</tt> (non-differentiable) : TS</dt>
<dd>scale of quantized input a</dd>
<dt><tt>a_zero_point</tt> (non-differentiable) : T1</dt>
<dd>zero point of quantized input a</dd>
<dt><tt>b</tt> (non-differentiable) : T2</dt>
<dd>N-dimensional quantized matrix b</dd>
<dt><tt>b_scale</tt> (non-differentiable) : TS</dt>
<dd>scale of quantized input b</dd>
<dt><tt>b_zero_point</tt> (non-differentiable) : T2</dt>
<dd>zero point of quantized input b</dd>
<dt><tt>y_scale</tt> (non-differentiable) : TS</dt>
<dd>scale of quantized output y</dd>
<dt><tt>y_zero_point</tt> (non-differentiable) : T3</dt>
<dd>zero point of quantized output y</dd>
</dl>

#### Outputs

<dl>
<dt><tt>y</tt> (non-differentiable) : T3</dt>
<dd>Quantized matrix multiply results from a * b</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>TS</tt> : tensor(float), tensor(float16), tensor(bfloat16)</dt>
<dd>Constrain scales.</dd>
<dt><tt>T1</tt> : tensor(int8), tensor(uint8), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz)</dt>
<dd>The type of input a and its zeropoint.</dd>
<dt><tt>T2</tt> : tensor(int8), tensor(uint8), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz)</dt>
<dd>The type of input b and its zeropoint.</dd>
<dt><tt>T3</tt> : tensor(int8), tensor(uint8), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz)</dt>
<dd>The type of the output and its zeropoint.</dd>
</dl>


#### Examples

<details>
<summary>int</summary>

```python
for quant_type_name in ["uint8", "int8"]:
    quant_type = getattr(np, quant_type_name)
    for dtype_name in ["float32", "float16"]:
        dtype = getattr(np, dtype_name)
        node = onnx.helper.make_node(
            "QLinearMatMul",
            inputs=[
                "a",
                "a_scale",
                "a_zero_point",
                "b",
                "b_scale",
                "b_zero_point",
                "y_scale",
                "y_zero_point",
            ],
            outputs=["y"],
        )

        # 2D
        a = np.array([[208, 236, 0, 238], [3, 214, 255, 29]])
        if quant_type == np.int8:
            a -= 127
        a = a.astype(quant_type)

        a_scale = np.array([0.0066], dtype=dtype)
        a_zero_point = np.array(
            [113 - 127] if quant_type == np.int8 else [113], dtype=quant_type
        )

        b = np.array(
            [[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]]
        )
        if quant_type == np.int8:
            b -= 127
        b = b.astype(quant_type)

        b_scale = np.array([0.00705], dtype=dtype)
        b_zero_point = np.array(
            [114 - 127] if quant_type == np.int8 else [114], dtype=quant_type
        )

        y_scale = np.array([0.0107], dtype=dtype)
        y_zero_point = np.array(
            [118 - 127] if quant_type == np.int8 else [118], dtype=quant_type
        )

        if quant_type == np.int8:
            output = np.array([[41, -12, -9], [1, -75, 20]])
        else:
            output = np.array([[168, 115, 255], [1, 66, 151]])
        output = output.astype(quant_type)

        expect(
            node,
            inputs=[
                a,
                a_scale,
                a_zero_point,
                b,
                b_scale,
                b_zero_point,
                y_scale,
                y_zero_point,
            ],
            outputs=[output],
            name=f"test_qlinearmatmul_2D_{quant_type_name}_{dtype_name}",
        )

        # 3D
        a = np.array(
            [
                [[208, 236, 0, 238], [3, 214, 255, 29]],
                [[208, 236, 0, 238], [3, 214, 255, 29]],
            ],
        )
        if quant_type == np.int8:
            a -= 127
        a = a.astype(quant_type)

        a_scale = np.array([0.0066], dtype=dtype)
        a_zero_point = np.array(
            [113 - 127] if quant_type == np.int8 else [113], dtype=quant_type
        )

        b = np.array(
            [
                [[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]],
                [[152, 51, 244], [60, 26, 255], [0, 127, 246], [127, 254, 247]],
            ],
        )
        if quant_type == np.int8:
            b -= 127
        b = b.astype(quant_type)

        b_scale = np.array([0.00705], dtype=dtype)
        b_zero_point = np.array([114], dtype=quant_type)

        y_scale = np.array([0.0107], dtype=dtype)
        y_zero_point = np.array(
            [118 - 127] if quant_type == np.int8 else [118], dtype=quant_type
        )

        if quant_type == np.int8:
            if dtype == np.float32:
                output = np.array(
                    [
                        [[-86, 117, 120], [115, 39, -121]],
                        [[-86, 117, 120], [115, 39, -121]],
                    ]
                )
            else:
                output = np.array(
                    [
                        [[-86, 116, 119], [115, 39, -121]],
                        [[-86, 116, 119], [115, 39, -121]],
                    ]
                )
        else:
            output = np.array(
                [
                    [[168, 115, 255], [1, 66, 151]],
                    [[168, 115, 255], [1, 66, 151]],
                ]
            )
        output = output.astype(quant_type)

        expect(
            node,
            inputs=[
                a,
                a_scale,
                a_zero_point,
                b,
                b_scale,
                b_zero_point,
                y_scale,
                y_zero_point,
            ],
            outputs=[output],
            name=f"test_qlinearmatmul_3D_{quant_type_name}_{dtype_name}",
        )
```

</details>


### <a name="QuantizeLinear"></a><a name="quantizelinear">**QuantizeLinear**</a>

  The linear quantization operator consumes a high-precision tensor, a scale, and a zero point to compute the
  low-precision/quantized tensor. The scale factor and zero point must have the same shape, determining the quantization
  granularity. The quantization formula is `y = saturate((x / y_scale) + y_zero_point)`.

  Saturation is done according to:
  - uint16: [0, 65535]
  - int16: [-32768, 32767]
  - uint8: [0, 255]
  - int8: [-128, 127]
  - uint4: [0, 15]
  - int4: [-8, 7]

  For `(x / y_scale)`, the result is rounded to the nearest integer, with ties to even. Refer to https://en.wikipedia.org/wiki/Rounding for details.

  `y_zero_point` and `y` must have the same type. `y_zero_point` is usually not used for quantization to float8 types, but the quantization
  formula remains the same for consistency, and the type of the input `y_zero_point` still determines the quantization type.

  There are three supported quantization granularities, determined by the shape of `y_scale`.
  In all cases, `y_zero_point` must have the same shape as `y_scale`.
  - Per-tensor (per-layer) quantization: `y_scale` is a scalar.
  - Per-axis quantization: The scale must be a 1-D tensor, with the length of the quantization axis. For an input shape
   `(D0, ..., Di, ..., Dn)` and `axis=i`, `y_scale` is a 1-D tensor of length `Di`.
  - Blocked quantization: The scale's shape is identical to the input's shape, except for one dimension, in which
    blocking is performed. Given `x` shape `(D0, ..., Di, ..., Dn)`, `axis=i`, and block size `B`: `y_scale` shape is
    `(D0, ..., ceil(Di/B), ..., Dn)`.
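
The formula, the rounding rule and the saturation ranges can be combined into a small NumPy sketch for the per-tensor and per-axis cases. `quantize_linear_reference` is a hypothetical helper, not the ONNX reference implementation; it handles integer targets only (via `np.iinfo`), takes the output type from `y_zero_point`, and does not cover blocked quantization or float8 types.

```python
import numpy as np


def quantize_linear_reference(x, y_scale, y_zero_point, axis=1):
    """Hypothetical reference for per-tensor and per-axis integer quantization."""
    x = np.asarray(x, dtype=np.float32)
    y_scale = np.asarray(y_scale, dtype=np.float32)
    y_zero_point = np.asarray(y_zero_point)
    if y_scale.ndim == 1:
        # Per-axis: reshape the 1-D parameters so they broadcast along `axis`.
        shape = [1] * x.ndim
        shape[axis % x.ndim] = y_scale.shape[0]
        y_scale = y_scale.reshape(shape)
        y_zero_point = y_zero_point.reshape(shape)
    # np.rint rounds halfway cases to the nearest even integer.
    y = np.rint(x / y_scale) + y_zero_point.astype(np.int64)
    # Saturate to the range of the target type, taken from y_zero_point.
    info = np.iinfo(y_zero_point.dtype)
    return np.clip(y, info.min, info.max).astype(y_zero_point.dtype)


q = quantize_linear_reference(
    [0.0, 1.5, 2.5, 1000.0, -3.0], y_scale=1.0, y_zero_point=np.uint8(0)
)
print(q.tolist())  # [0, 2, 2, 255, 0]
```

In the output, 1.5 and 2.5 both round to 2 (ties to even), while 1000 and -3 saturate to the uint8 limits 255 and 0.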

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#QuantizeLinear-10">10</a>, <a href="Changelog.md#QuantizeLinear-13">13</a>, <a href="Changelog.md#QuantizeLinear-19">19</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 1)</dt>
<dd>(Optional) The axis of the dequantizing dimension of the input tensor. Used for per-axis and blocked quantization. Negative value means counting dimensions from the back. Accepted range is `[-r, r-1]` where `r = rank(input)`.</dd>
<dt><tt>block_size</tt> : int (default is 0)</dt>
<dd>(Optional) The size of the quantization block (number of times every scale is replicated). Used only for blocked quantization. The block size is a positive integer. Given `x` shape `(D0, ..., Di, ..., Dn)`, `y_scale` shape `(S0, ... Si, ...Sn)` and `axis=i`, the accepted range is `[ceil(Di/Si), ceil(Di/(Si-1))-1]`</dd>
<dt><tt>output_dtype</tt> : int (default is 0)</dt>
<dd>(Optional) The output data type. If not supplied, the output data type is inferred from `y_zero_point` data type (`T2`). If neither `output_dtype` nor `y_zero_point` are supplied, output data type is uint8. If both `output_dtype` and `y_zero_point` are specified, `output_dtype` must be `T2`.</dd>
<dt><tt>saturate</tt> : int (default is 1)</dt>
<dd>The parameter defines how the conversion behaves if an input value is out of range of the destination type. It only applies for float 8 quantization (float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz). It is true by default. All cases are fully described in two tables inserted in the operator description.</dd>
</dl>
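
The accepted range for `block_size` follows from requiring `ceil(Di / B) == Si`. The sketch below computes that range with integer arithmetic; `block_size_range` is a hypothetical helper, and returning `None` as the upper bound when `Si == 1` (where the formula would divide by zero) is an assumption made for this illustration.

```python
def ceil_div(n: int, d: int) -> int:
    """Integer ceiling division for positive d."""
    return -(-n // d)


def block_size_range(dim: int, scale_dim: int):
    """Hypothetical helper: inclusive (low, high) block sizes B such that
    ceil(dim / B) == scale_dim, following [ceil(Di/Si), ceil(Di/(Si-1))-1]."""
    low = ceil_div(dim, scale_dim)
    # Assumption: with a single block along the axis there is no finite upper bound.
    high = None if scale_dim == 1 else ceil_div(dim, scale_dim - 1) - 1
    return low, high


# x has 4 elements along the axis and y_scale has 2: block sizes 2 and 3 both fit.
print(block_size_range(4, 2))  # (2, 3)
```

An empty range (`low > high`) means no block size is consistent with the given shapes; for example `block_size_range(5, 4)` yields `(2, 1)`.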

#### Inputs (2 - 3)

<dl>
<dt><tt>x</tt> : T1</dt>
<dd>N-D full precision input tensor to be quantized.</dd>
<dt><tt>y_scale</tt> : T1</dt>
<dd>Scale for doing quantization to get `y`. For per-tensor/layer quantization the scale is a scalar, for per-axis quantization it is a 1-D Tensor and for blocked quantization it has the same shape as the input, except for one dimension in which blocking is performed.</dd>
<dt><tt>y_zero_point</tt> (optional) : T2</dt>
<dd>Zero point for doing quantization to get `y`. Shape must match `y_scale`. Default is uint8 with zero point of 0 if it's not specified.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>y</tt> : T2</dt>
<dd>N-D quantized output tensor. It has same shape as input `x`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float), tensor(float16), tensor(bfloat16), tensor(int32)</dt>
<dd>The type of the input 'x'.</dd>
<dt><tt>T2</tt> : tensor(int8), tensor(uint8), tensor(int16), tensor(uint16), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>The type of the input `y_zero_point` and the output `y`.</dd>
</dl>


#### Examples

<details>
<summary>axis</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array(
    [
        [
            [[-162, 10], [-100, 232], [-20, -50]],
            [[-76, 0], [0, 252], [32, -44]],
            [[245, -485], [-960, -270], [-375, -470]],
        ],
    ],
    dtype=np.float32,
)
y_scale = np.array([2, 4, 5], dtype=np.float32)
y_zero_point = np.array([84, 24, 196], dtype=np.uint8)
y = (x / y_scale.reshape(1, 3, 1, 1) + y_zero_point.reshape(1, 3, 1, 1)).astype(
    np.uint8
)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_axis",
)
```

</details>


<details>
<summary>blocked_asymmetric</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
    axis=1,
    block_size=2,
)

x = np.array(
    [
        [6.0, 12.0, 50.0, 5.0],
        [1.0, 8.0, 4.0, 5.0],
        [0.0, 20.0, 10.0, 4.0],
    ],
    dtype=np.float32,
)
y_scale = np.array(
    [
        [1.5, 2.5],
        [3.0, 4.9],
        [5.1, 6.9],
    ],
    dtype=np.float32,
)
y_zero_point = np.array(
    [
        [0, 1],
        [1, 0],
        [2, 3],
    ],
    dtype=np.uint8,
)
# x.shape = (3, 4)
# y_scale.shape = (3, 2)
assert y_scale.shape == y_zero_point.shape
block_axis = 1
# The block shape is [x.shape[i] // y_scale.shape[i] for i in range(len(x.shape))] = (1, 2)
assert all(
    x.shape[i] == y_scale.shape[i]
    for i in range(len(x.shape))
    if i != block_axis
)
assert x.shape[block_axis] % y_scale.shape[block_axis] == 0
repeats = x.shape[block_axis] // y_scale.shape[block_axis]

# Create element-wise scale and zero point
y_scale_elementwise = np.repeat(y_scale, repeats=repeats, axis=block_axis)
y_zero_point_elementwise = np.repeat(
    y_zero_point, repeats=repeats, axis=block_axis
)

y = np.rint(x / y_scale_elementwise + y_zero_point_elementwise).astype(np.uint8)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_blocked_asymmetric",
)
```

</details>


<details>
<summary>blocked_symmetric</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale"],
    outputs=["y"],
    axis=1,
    block_size=2,
    output_dtype=TensorProto.INT16,
)

x = np.array(
    [
        [6.0, -8, -10, 5.0],
        [1.0, 8.0, 4.0, 5.0],
        [0.0, 20.0, 10.0, 4.0],
    ],
    dtype=np.float32,
)

y_scale = np.array(
    [
        [1.5, 2.5],
        [3.0, 4.9],
        [5.1, 6.9],
    ],
    dtype=np.float32,
)

# x.shape = (3, 4)
# y_scale.shape = (3, 2)

block_axis = 1
# The block shape is [x.shape[i] // y_scale.shape[i] for i in range(len(x.shape))] = (1, 2)
assert all(
    x.shape[i] == y_scale.shape[i]
    for i in range(len(x.shape))
    if i != block_axis
)
assert x.shape[block_axis] % y_scale.shape[block_axis] == 0
repeats = x.shape[block_axis] // y_scale.shape[block_axis]

# Create element-wise scale and zero point
y_scale_elementwise = np.repeat(y_scale, repeats=repeats, axis=block_axis)

y_val = np.clip(
    np.rint(x / y_scale_elementwise), a_min=-32768, a_max=32767
).astype(np.int16)
y = make_tensor(
    "y",
    TensorProto.INT16,
    x.shape,
    y_val,
)
expect(
    node,
    inputs=[x, y_scale],
    outputs=[y],
    name="test_quantizelinear_blocked_symmetric",
)
```

</details>


<details>
<summary>e4m3fn</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array([0.0, 1.0, 2.0, 100000.0, 200.0]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = make_tensor("y_zero_point", TensorProto.FLOAT8E4M3FN, [1], [0])
y = make_tensor("y", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, 96])

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_e4m3fn",
)
```

</details>


<details>
<summary>e5m2</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array([0.0, 1.0, 2.0, 100000.0, 200.0]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = make_tensor("y_zero_point", TensorProto.FLOAT8E5M2, [1], [0.0])
y = make_tensor("y", TensorProto.FLOAT8E5M2, [5], [0, 0.5, 1, 49152, 96])

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_e5m2",
)
```

</details>


<details>
<summary>int16</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array(
    [
        0.0,
        -514.0,
        3.0,
        -3.0,
        2.9,
        -2.9,
        3.1,
        -3.1,
        65022.0,
        -66046.0,
        65023.0,
        -66047.0,
        65024.0,
        -66048.0,
        70000.0,
        -70000.0,
    ]
).astype(np.float32)
y_scale = np.float32(2.0)
y_zero_point = np.int16(256)
y = np.array(
    [
        256,
        -1,
        258,
        254,
        257,
        255,
        258,
        254,
        32767,
        -32767,
        32767,
        -32768,
        32767,
        -32768,
        32767,
        -32768,
    ]
).astype(np.int16)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_int16",
)
```

</details>


<details>
<summary>int4</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
    axis=0,
)

x = np.array(
    [
        [0.0, 2.5, 4.8, 8.6],
        [-30, -20, 6, 9],
        [12, 15, 16, 40],
    ]
).astype(np.float32)

y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32)
y_zero_point = make_tensor(
    "y_zero_point", TensorProto.INT4, y_scale.shape, np.ones_like(y_scale)
)
y = make_tensor(
    "y", TensorProto.INT4, x.shape, [1, 2, 3, 5, -8, -6, 3, 4, 4, 5, 5, 7]
)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_int4",
)
```

</details>


<details>
<summary>quantizelinear</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array([0, 2, 3, 1000, -254, -1000]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = np.uint8(128)
y = np.array([128, 129, 130, 255, 1, 0]).astype(np.uint8)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear",
)
```

</details>


<details>
<summary>uint16</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array(
    [
        0.0,
        -128.0,
        3.0,
        -3.0,
        2.9,
        -2.9,
        3.1,
        -3.1,
        65536.0,
        -65534.0,
        70000.0,
        -70000.0,
    ]
).astype(np.float32)
y_scale = np.float32(2.0)
y_zero_point = np.uint16(32767)
y = np.array(
    [
        32767,
        32703,
        32769,
        32765,
        32768,
        32766,
        32769,
        32765,
        65535,
        0,
        65535,
        0,
    ]
).astype(np.uint16)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_uint16",
)
```

</details>


<details>
<summary>uint4</summary>

```python
node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
    axis=0,
)

x = np.array(
    [
        [0.0, 2.5, 4.8, 8.6],
        [-30, -20, 6, 9],
        [12, 15, 16, 40],
    ]
).astype(np.float32)

y_scale = np.asarray([2.0, 3.0, 4.0], dtype=np.float32)
y_zero_point = make_tensor(
    "y_zero_point", TensorProto.UINT4, y_scale.shape, np.ones_like(y_scale)
)
y = make_tensor(
    "y", TensorProto.UINT4, x.shape, [1, 2, 3, 5, -1, -1, 3, 4, 4, 5, 5, 11]
)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_uint4",
)
```

</details>


### <a name="RNN"></a><a name="rnn">**RNN**</a>

  Computes a one-layer simple RNN. This operator is usually supported
  via some custom implementation such as CuDNN.

  Notations:

  * `X` - input tensor
  * `i` - input gate
  * `t` - time step (t-1 means previous time step)
  * `Wi` - W parameter weight matrix for input gate
  * `Ri` - R recurrence weight matrix for input gate
  * `Wbi` - W parameter bias vector for input gate
  * `Rbi` - R parameter bias vector for input gate
  * `WBi` - W parameter weight matrix for backward input gate
  * `RBi` - R recurrence weight matrix for backward input gate
  * `WBbi` - W parameter bias vector for backward input gate
  * `RBbi` - R parameter bias vector for backward input gate
  * `H` - Hidden state
  * `num_directions` - 2 if direction == bidirectional else 1

  Activation functions:

  * Relu(x)                - max(0, x)
  * Tanh(x)                - (1 - e^{-2x})/(1 + e^{-2x})
  * Sigmoid(x)             - 1/(1 + e^{-x})

  NOTE: Below are optional

  * Affine(x)              - alpha*x + beta
  * LeakyRelu(x)           - x if x >= 0 else alpha * x
  * ThresholdedRelu(x)     - x if x >= alpha else 0
  * ScaledTanh(x)          - alpha*Tanh(beta*x)
  * HardSigmoid(x)         - min(max(alpha*x + beta, 0), 1)
  * Elu(x)                 - x if x >= 0 else alpha*(e^x - 1)
  * Softsign(x)            - x/(1 + |x|)
  * Softplus(x)            - log(1 + e^x)
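
The optional activation formulas listed above can be written directly in NumPy. This is a sketch for illustration only: the function names are hypothetical, and the default `alpha`/`beta` values are assumptions taken from the corresponding standalone ONNX operators (only the LeakyRelu default of 0.01 is stated in this section).

```python
import numpy as np


def leaky_relu(x, alpha=0.01):
    # x if x >= 0 else alpha * x
    return np.where(x >= 0, x, alpha * x)


def thresholded_relu(x, alpha=1.0):
    # x if x >= alpha else 0
    return np.where(x >= alpha, x, 0.0)


def hard_sigmoid(x, alpha=0.2, beta=0.5):
    # min(max(alpha*x + beta, 0), 1)
    return np.minimum(np.maximum(alpha * x + beta, 0.0), 1.0)


def elu(x, alpha=1.0):
    # x if x >= 0 else alpha*(e^x - 1)
    return np.where(x >= 0, x, alpha * (np.exp(x) - 1.0))


def softsign(x):
    # x/(1 + |x|)
    return x / (1.0 + np.abs(x))


def softplus(x):
    # log(1 + e^x)
    return np.log1p(np.exp(x))
```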

  Equations (Default: f=Tanh):

  * Ht = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi)

This operator has **optional** inputs/outputs. See [the doc](IR.md) for more details about the representation of optional arguments. An empty string may be used in the place of an actual argument's name to indicate a missing argument. Trailing optional arguments (those not followed by an argument that is present) may also be simply omitted.
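
The recurrence `Ht = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi)` can be sketched in NumPy for the forward direction with `f = Tanh`. This is a minimal illustration rather than a reference implementation: `simple_rnn_forward` is a hypothetical helper, it ignores `clip`, `sequence_lens` and the backward direction, and its `Y` omits the `num_directions` axis.

```python
import numpy as np


def simple_rnn_forward(X, Wi, Ri, Wbi=None, Rbi=None, H0=None):
    """Forward-direction sketch; X is [seq_length, batch_size, input_size]."""
    seq_length, batch_size, _ = X.shape
    hidden_size = Wi.shape[0]
    # Missing biases and initial state are assumed to be 0.
    Wbi = np.zeros(hidden_size, dtype=X.dtype) if Wbi is None else Wbi
    Rbi = np.zeros(hidden_size, dtype=X.dtype) if Rbi is None else Rbi
    H = np.zeros((batch_size, hidden_size), dtype=X.dtype) if H0 is None else H0
    outputs = []
    for t in range(seq_length):
        # Ht = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi) with f = Tanh
        H = np.tanh(X[t] @ Wi.T + H @ Ri.T + Wbi + Rbi)
        outputs.append(H)
    # Y stacks every hidden state; Y_h is the last one.
    return np.stack(outputs), H


X = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]], dtype=np.float32)
Wi = 0.1 * np.ones((4, 2), dtype=np.float32)
Ri = 0.1 * np.ones((4, 4), dtype=np.float32)
Y, Y_h = simple_rnn_forward(X, Wi, Ri)
```

With a single time step and a zero initial state, each row of `Y_h` is simply `tanh` of the weighted sum of that batch entry's inputs.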

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#RNN-1">1</a>, <a href="Changelog.md#RNN-7">7</a>

#### Attributes

<dl>
<dt><tt>activation_alpha</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators. For example, with LeakyRelu, the default alpha is 0.01.</dd>
<dt><tt>activation_beta</tt> : list of floats</dt>
<dd>Optional scaling values used by some activation functions. The values are consumed in the order of activation functions, for example (f, g, h) in LSTM. Default values are the same as those of the corresponding ONNX operators.</dd>
<dt><tt>activations</tt> : list of strings (default is ['Tanh', 'Tanh'])</dt>
<dd>One (or two if bidirectional) activation function for input gate. The activation function must be one of the activation functions specified above. Optional: Default `Tanh` if not specified.</dd>
<dt><tt>clip</tt> : float</dt>
<dd>Cell clip threshold. Clipping bounds the elements of a tensor in the range of [-threshold, +threshold] and is applied to the input of activations. No clip if not specified.</dd>
<dt><tt>direction</tt> : string (default is forward)</dt>
<dd>Specify if the RNN is forward, reverse, or bidirectional. Must be one of forward (default), reverse, or bidirectional.</dd>
<dt><tt>hidden_size</tt> : int</dt>
<dd>Number of neurons in the hidden layer</dd>
<dt><tt>layout</tt> : int (default is 0)</dt>
<dd>The shape format of inputs X, initial_h and outputs Y, Y_h. If 0, the following shapes are expected: X.shape = [seq_length, batch_size, input_size], Y.shape = [seq_length, num_directions, batch_size, hidden_size], initial_h.shape = Y_h.shape = [num_directions, batch_size, hidden_size]. If 1, the following shapes are expected: X.shape = [batch_size, seq_length, input_size], Y.shape = [batch_size, seq_length, num_directions, hidden_size], initial_h.shape = Y_h.shape = [batch_size, num_directions, hidden_size].</dd>
</dl>
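
The two `layout` values differ only in where the batch dimension sits. The following NumPy sketch, using arbitrary example sizes chosen for illustration, shows the axis permutations that map the `layout=0` shapes to the `layout=1` shapes.

```python
import numpy as np

# Arbitrary sizes, chosen only for illustration.
seq_length, batch_size, input_size = 5, 3, 2
num_directions, hidden_size = 1, 4

# layout=0 shapes
X0 = np.zeros((seq_length, batch_size, input_size), dtype=np.float32)
Y0 = np.zeros((seq_length, num_directions, batch_size, hidden_size), dtype=np.float32)
Y_h0 = np.zeros((num_directions, batch_size, hidden_size), dtype=np.float32)

# layout=1 equivalents: batch_size moves to the front.
X1 = np.transpose(X0, (1, 0, 2))      # [batch_size, seq_length, input_size]
Y1 = np.transpose(Y0, (2, 0, 1, 3))   # [batch_size, seq_length, num_directions, hidden_size]
Y_h1 = np.transpose(Y_h0, (1, 0, 2))  # [batch_size, num_directions, hidden_size]
```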

#### Inputs (3 - 6)

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>The input sequences packed (and potentially padded) into one 3-D tensor with the shape of `[seq_length, batch_size, input_size]`.</dd>
<dt><tt>W</tt> (differentiable) : T</dt>
<dd>The weight tensor for input gate. Concatenation of `Wi` and `WBi` (if bidirectional). The tensor has shape `[num_directions, hidden_size, input_size]`.</dd>
<dt><tt>R</tt> (differentiable) : T</dt>
<dd>The recurrence weight tensor. Concatenation of `Ri` and `RBi` (if bidirectional). The tensor has shape `[num_directions, hidden_size, hidden_size]`.</dd>
<dt><tt>B</tt> (optional, differentiable) : T</dt>
<dd>The bias tensor for input gate. Concatenation of `[Wbi, Rbi]` and `[WBbi, RBbi]` (if bidirectional). The tensor has shape `[num_directions, 2*hidden_size]`. Optional: If not specified - assumed to be 0.</dd>
<dt><tt>sequence_lens</tt> (optional, non-differentiable) : T1</dt>
<dd>Optional tensor specifying lengths of the sequences in a batch. If not specified - assumed all sequences in the batch to have length `seq_length`. It has shape `[batch_size]`.</dd>
<dt><tt>initial_h</tt> (optional, non-differentiable) : T</dt>
<dd>Optional initial value of the hidden state. If not specified - assumed to be 0. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Outputs (0 - 2)

<dl>
<dt><tt>Y</tt> (optional, differentiable) : T</dt>
<dd>A tensor that concatenates all the intermediate output values of the hidden state. It has shape `[seq_length, num_directions, batch_size, hidden_size]`.</dd>
<dt><tt>Y_h</tt> (optional, differentiable) : T</dt>
<dd>The last output value of the hidden state. It has shape `[num_directions, batch_size, hidden_size]`.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>T1</tt> : tensor(int32)</dt>
<dd>Constrain `sequence_lens` to an integer tensor.</dd>
</dl>


#### Examples

<details>
<summary>batchwise</summary>

```python
input = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]).astype(np.float32)

input_size = 2
hidden_size = 4
weight_scale = 0.5
layout = 1

node = onnx.helper.make_node(
    "RNN",
    inputs=["X", "W", "R"],
    outputs=["Y", "Y_h"],
    hidden_size=hidden_size,
    layout=layout,
)

W = weight_scale * np.ones((1, hidden_size, input_size)).astype(np.float32)
R = weight_scale * np.ones((1, hidden_size, hidden_size)).astype(np.float32)

rnn = RNNHelper(X=input, W=W, R=R, layout=layout)
Y, Y_h = rnn.step()
expect(
    node,
    inputs=[input, W, R],
    outputs=[Y.astype(np.float32), Y_h.astype(np.float32)],
    name="test_simple_rnn_batchwise",
)
```

</details>


<details>
<summary>defaults</summary>

```python
input = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]).astype(np.float32)

input_size = 2
hidden_size = 4
weight_scale = 0.1

node = onnx.helper.make_node(
    "RNN", inputs=["X", "W", "R"], outputs=["", "Y_h"], hidden_size=hidden_size
)

W = weight_scale * np.ones((1, hidden_size, input_size)).astype(np.float32)
R = weight_scale * np.ones((1, hidden_size, hidden_size)).astype(np.float32)

rnn = RNNHelper(X=input, W=W, R=R)
_, Y_h = rnn.step()
expect(
    node,
    inputs=[input, W, R],
    outputs=[Y_h.astype(np.float32)],
    name="test_simple_rnn_defaults",
)
```

</details>


<details>
<summary>initial_bias</summary>

```python
input = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]).astype(
    np.float32
)

input_size = 3
hidden_size = 5
custom_bias = 0.1
weight_scale = 0.1

node = onnx.helper.make_node(
    "RNN",
    inputs=["X", "W", "R", "B"],
    outputs=["", "Y_h"],
    hidden_size=hidden_size,
)

W = weight_scale * np.ones((1, hidden_size, input_size)).astype(np.float32)
R = weight_scale * np.ones((1, hidden_size, hidden_size)).astype(np.float32)

# Adding custom bias
W_B = custom_bias * np.ones((1, hidden_size)).astype(np.float32)
R_B = np.zeros((1, hidden_size)).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)

rnn = RNNHelper(X=input, W=W, R=R, B=B)
_, Y_h = rnn.step()
expect(
    node,
    inputs=[input, W, R, B],
    outputs=[Y_h.astype(np.float32)],
    name="test_simple_rnn_with_initial_bias",
)
```

</details>


<details>
<summary>seq_length</summary>

```python
input = np.array(
    [
        [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
        [[10.0, 11.0, 12.0], [13.0, 14.0, 15.0], [16.0, 17.0, 18.0]],
    ]
).astype(np.float32)

input_size = 3
hidden_size = 5

node = onnx.helper.make_node(
    "RNN",
    inputs=["X", "W", "R", "B"],
    outputs=["", "Y_h"],
    hidden_size=hidden_size,
)

W = np.random.randn(1, hidden_size, input_size).astype(np.float32)
R = np.random.randn(1, hidden_size, hidden_size).astype(np.float32)

# Adding custom bias
W_B = np.random.randn(1, hidden_size).astype(np.float32)
R_B = np.random.randn(1, hidden_size).astype(np.float32)
B = np.concatenate((W_B, R_B), axis=1)

rnn = RNNHelper(X=input, W=W, R=R, B=B)
_, Y_h = rnn.step()
expect(
    node,
    inputs=[input, W, R, B],
    outputs=[Y_h.astype(np.float32)],
    name="test_rnn_seq_length",
)
```

</details>


### <a name="RandomNormal"></a><a name="randomnormal">**RandomNormal**</a>

  Generate a tensor with random values drawn from a normal distribution. The shape
  of the tensor is specified by the `shape` argument and the parameters of the normal distribution
  are specified by `mean` and `scale`.

  The data type is specified by the 'dtype' argument. The 'dtype' argument must
  be one of the data types specified in the 'DataType' enum field in the
  TensorProto message.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int (default is 1)</dt>
<dd>The data type for the elements of the output tensor. Default is TensorProto::FLOAT.</dd>
<dt><tt>mean</tt> : float (default is 0.0)</dt>
<dd>The mean of the normal distribution.</dd>
<dt><tt>scale</tt> : float (default is 1.0)</dt>
<dd>The standard deviation of the normal distribution.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator. If not specified, one is generated automatically.</dd>
<dt><tt>shape</tt> : list of ints (required)</dt>
<dd>The shape of the output tensor.</dd>
</dl>
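
A NumPy stand-in can illustrate how these attributes fit together. This is only a sketch: `random_normal` is a hypothetical helper, the random stream of a real ONNX backend will generally differ from NumPy's, and casting the float `seed` to an integer is an assumption made so that NumPy accepts it.

```python
import numpy as np


def random_normal(shape, mean=0.0, scale=1.0, seed=None, dtype=np.float32):
    """Hypothetical NumPy stand-in for RandomNormal."""
    # The ONNX `seed` attribute is a float; NumPy expects an integer seed,
    # so it is cast here (an assumption of this sketch).
    rng = np.random.default_rng(None if seed is None else int(seed))
    # `scale` is the standard deviation, matching NumPy's parameter name.
    return rng.normal(loc=mean, scale=scale, size=tuple(shape)).astype(dtype)


sample = random_normal([2, 3], mean=1.0, scale=0.5, seed=7.0)
```

Passing the same `seed` twice yields the same tensor in this sketch, while omitting it lets NumPy pick fresh entropy on every call.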

#### Inputs


#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of random values drawn from normal distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>


### <a name="RandomNormalLike"></a><a name="randomnormallike">**RandomNormalLike**</a>

  Generate a tensor with random values drawn from a normal distribution.
  The shape of the output tensor is copied from the shape of the input tensor,
  and the parameters of the normal distribution are specified by `mean` and `scale`.

  The data type is specified by the 'dtype' argument, or copied from the input tensor if not provided.
  The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the
  TensorProto message, and be valid as an output type.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>(Optional) The data type for the elements of the output tensor. If not specified, the data type of the input tensor is used.</dd>
<dt><tt>mean</tt> : float (default is 0.0)</dt>
<dd>The mean of the normal distribution.</dd>
<dt><tt>scale</tt> : float (default is 1.0)</dt>
<dd>The standard deviation of the normal distribution.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator. If not specified, one is generated automatically.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor to copy shape and optionally type information from.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor of random values drawn from normal distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain to any tensor type. If the dtype attribute is not provided this must be a valid output type.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>


### <a name="RandomUniform"></a><a name="randomuniform">**RandomUniform**</a>

  Generate a tensor with random values drawn from a uniform distribution. The shape
  of the tensor is specified by the `shape` argument and the range by `low` and `high`.

  The data type is specified by the 'dtype' argument. The 'dtype' argument must
  be one of the data types specified in the 'DataType' enum field in the
  TensorProto message.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int (default is 1)</dt>
<dd>The data type for the elements of the output tensor. If not specified, default is TensorProto::FLOAT.</dd>
<dt><tt>high</tt> : float (default is 1.0)</dt>
<dd>Upper boundary of the output values.</dd>
<dt><tt>low</tt> : float (default is 0.0)</dt>
<dd>Lower boundary of the output values.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator. If not specified, one is generated automatically.</dd>
<dt><tt>shape</tt> : list of ints (required)</dt>
<dd>The shape of the output tensor.</dd>
</dl>

#### Inputs


#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>Output tensor of random values drawn from uniform distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>


### <a name="RandomUniformLike"></a><a name="randomuniformlike">**RandomUniformLike**</a>

  Generate a tensor with random values drawn from a uniform distribution.
  The shape of the output tensor is copied from the shape of the input tensor,
  and the parameters of the uniform distribution are specified by `low` and `high`.

  The data type is specified by the 'dtype' argument, or copied from the input tensor if not provided.
  The 'dtype' argument must be one of the data types specified in the 'DataType' enum field in the
  TensorProto message and be valid as an output type.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>(Optional) The data type for the elements of the output tensor. If not specified, the data type of the input tensor is used.</dd>
<dt><tt>high</tt> : float (default is 1.0)</dt>
<dd>Upper boundary of the output values.</dd>
<dt><tt>low</tt> : float (default is 0.0)</dt>
<dd>Lower boundary of the output values.</dd>
<dt><tt>seed</tt> : float</dt>
<dd>(Optional) Seed for the random generator. If not specified, one is generated automatically.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T1</dt>
<dd>Input tensor to copy shape and optionally type information from.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T2</dt>
<dd>Output tensor of random values drawn from uniform distribution</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain to any tensor type. If the dtype attribute is not provided this must be a valid output type.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain output types to float tensors.</dd>
</dl>
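
The interplay between the input tensor, the optional `dtype` and the `T2` constraint can be sketched with NumPy. `random_uniform_like` is a hypothetical helper written for illustration; it uses NumPy dtypes in place of `TensorProto` enum values, and the integer cast of `seed` is an assumption of the sketch.

```python
import numpy as np

# Output types allowed by the T2 constraint above.
FLOAT_TYPES = (np.dtype(np.float16), np.dtype(np.float32), np.dtype(np.float64))


def random_uniform_like(x, low=0.0, high=1.0, seed=None, dtype=None):
    """Hypothetical NumPy stand-in for RandomUniformLike."""
    # Without `dtype`, the output type is copied from the input tensor.
    out_dtype = x.dtype if dtype is None else np.dtype(dtype)
    if out_dtype not in FLOAT_TYPES:
        raise TypeError(f"{out_dtype} is not a valid output type")
    rng = np.random.default_rng(None if seed is None else int(seed))
    # The shape is always copied from the input tensor.
    return rng.uniform(low, high, size=x.shape).astype(out_dtype)


like_float = random_uniform_like(np.zeros((2, 3), dtype=np.float64), low=-1.0, high=1.0, seed=0)
like_int = random_uniform_like(np.zeros((4,), dtype=np.int64), dtype=np.float32, seed=0)
```

An integer input is accepted only when `dtype` names a float type, which mirrors the note that the input type must itself be a valid output type when `dtype` is omitted.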


### <a name="Range"></a><a name="range">**Range**</a>

  Generate a tensor containing a sequence of numbers that begins at `start` and extends by increments of `delta`
  up to `limit` (exclusive).

  The number of elements in the output of range is computed as below:

  ```
  number_of_elements = max( ceil( (limit - start) / delta ) , 0 )
  ```

  The pseudocode determining the contents of the output is shown below:

  ```
  for(int i=0; i<number_of_elements; ++i) {
    output[i] =  start + (i * delta);
  }
  ```

  Example 1

  ```
  Inputs: start = 3, limit = 9, delta = 3
  Output: [3, 6]
  ```

  Example 2

  ```
  Inputs: start = 10, limit = 4, delta = -2
  Output: [10, 8, 6]
  ```
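
The element-count formula and the pseudocode above translate directly into Python. This is a minimal sketch using plain Python numbers rather than tensors; `onnx_range` is a hypothetical name chosen for illustration.

```python
import math


def onnx_range(start, limit, delta):
    """Hypothetical helper mirroring the Range pseudocode with Python scalars."""
    # number_of_elements = max(ceil((limit - start) / delta), 0)
    number_of_elements = max(math.ceil((limit - start) / delta), 0)
    return [start + i * delta for i in range(number_of_elements)]


example_1 = onnx_range(3, 9, 3)    # [3, 6]
example_2 = onnx_range(10, 4, -2)  # [10, 8, 6]
```

The `max(..., 0)` clamp is what makes a range whose `delta` points away from `limit` come out empty instead of raising an error.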

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>start</tt> : T</dt>
<dd>Scalar. First entry for the range of output values.</dd>
<dt><tt>limit</tt> : T</dt>
<dd>Scalar. Exclusive limit for the range of output values (an upper bound when `delta` is positive, a lower bound when it is negative).</dd>
<dt><tt>delta</tt> : T</dt>
<dd>Scalar. Value to step by.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> : T</dt>
<dd>A 1-D tensor with same type as the inputs containing generated range of values.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float), tensor(double), tensor(int16), tensor(int32), tensor(int64)</dt>
<dd>Constrain input types to common numeric type tensors.</dd>
</dl>


#### Examples

<details>
<summary>range_float_type_positive_delta</summary>

```python
node = onnx.helper.make_node(
    "Range",
    inputs=["start", "limit", "delta"],
    outputs=["output"],
)

start = np.float32(1)
limit = np.float32(5)
delta = np.float32(2)

output = np.arange(
    start, limit, delta, dtype=np.float32
)  # expected output [1.0, 3.0]
expect(
    node,
    inputs=[start, limit, delta],
    outputs=[output],
    name="test_range_float_type_positive_delta",
)
```

</details>


<details>
<summary>range_int32_type_negative_delta</summary>

```python
node = onnx.helper.make_node(
    "Range",
    inputs=["start", "limit", "delta"],
    outputs=["output"],
)

start = np.int32(10)
limit = np.int32(6)
delta = np.int32(-3)

output = np.arange(
    start, limit, delta, dtype=np.int32
)  # expected output [10, 7]
expect(
    node,
    inputs=[start, limit, delta],
    outputs=[output],
    name="test_range_int32_type_negative_delta",
)
```

</details>


### <a name="Reciprocal"></a><a name="reciprocal">**Reciprocal**</a>

  Reciprocal takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the reciprocal, y = 1/x, is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Reciprocal-1">1</a>, <a href="Changelog.md#Reciprocal-6">6</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>reciprocal</summary>

```python
node = onnx.helper.make_node(
    "Reciprocal",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-4, 2]).astype(np.float32)
y = np.reciprocal(x)  # expected output [-0.25, 0.5]
expect(node, inputs=[x], outputs=[y], name="test_reciprocal_example")

x = np.random.rand(3, 4, 5).astype(np.float32) + 0.5
y = np.reciprocal(x)
expect(node, inputs=[x], outputs=[y], name="test_reciprocal")
```

</details>


### <a name="ReduceL1"></a><a name="reducel1">**ReduceL1**</a>

  Computes the L1 norm of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields 0.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceL1-1">1</a>, <a href="Changelog.md#ReduceL1-11">11</a>, <a href="Changelog.md#ReduceL1-13">13</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. The default behavior, with 'false', is to reduce all axes. When 'axes' is empty and this attribute is set to true, the input tensor is not reduced, and the output tensor is equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>
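
How `axes`, `keepdims` and `noop_with_empty_axes` interact can be sketched with NumPy. `reduce_l1` is a hypothetical helper written for illustration; in the no-op case it returns the input unchanged, following the attribute description above.

```python
import numpy as np


def reduce_l1(data, axes=None, keepdims=1, noop_with_empty_axes=0):
    """Hypothetical NumPy sketch of ReduceL1."""
    axes_empty = axes is None or len(axes) == 0
    if axes_empty and noop_with_empty_axes:
        # Per the attribute description, the input is passed through unreduced.
        return data
    # Empty or missing axes means "reduce over every dimension".
    axis = None if axes_empty else tuple(int(a) for a in axes)
    return np.sum(np.abs(data), axis=axis, keepdims=keepdims == 1)


data = np.arange(1, 13, dtype=np.float32).reshape(3, 2, 2)
all_axes = reduce_l1(data)                                  # shape (1, 1, 1)
last_axis = reduce_l1(data, np.array([2]), keepdims=0)      # shape (3, 2)
untouched = reduce_l1(data, np.array([], dtype=np.int64), noop_with_empty_axes=1)
```

Negative entries in `axes` need no special handling here because NumPy already accepts axes in the range `[-r, r-1]`.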

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceL1", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sum(a=np.abs(data), axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[78.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_default_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=None, keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_default_axes_keepdims_random",
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceL1",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[3., 7.], [11., 15.], [19., 23.]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_do_not_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_do_not_keepdims_random",
)
```

</details>


<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceL1",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceL1",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3.], [7.]], [[11.], [15.]], [[19.], [23.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_keep_dims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_keep_dims_random",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceL1",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3.], [7.]], [[11.], [15.]], [[19.], [23.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_negative_axes_keep_dims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(a=np.abs(data), axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l1_negative_axes_keep_dims_random",
)
```

</details>


### <a name="ReduceL2"></a><a name="reducel2">**ReduceL2**</a>

  Computes the L2 norm of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields 0.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
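
As a minimal sketch of the description above, the following NumPy snippet computes an L2 reduction with and without `keepdims`, and over a zero-length axis. The helper name `reduce_l2_reference` is hypothetical and used only for illustration; it is not part of the ONNX API.

```python
import numpy as np


def reduce_l2_reference(data, axes=None, keepdims=1):
    """Hypothetical NumPy reference: sqrt of the sum of squares along `axes`."""
    axis = None if axes is None else tuple(int(a) for a in axes)
    return np.sqrt(np.sum(np.square(data), axis=axis, keepdims=keepdims == 1))


data = np.arange(1, 13, dtype=np.float32).reshape(3, 2, 2)

kept = reduce_l2_reference(data, axes=[2], keepdims=1)
pruned = reduce_l2_reference(data, axes=[2], keepdims=0)
print(kept.shape)    # (3, 2, 1): same rank as the input
print(pruned.shape)  # (3, 2): reduced dimension pruned

# Reducing over an empty set of values (a zero-length axis) yields 0.
empty = np.zeros((2, 0, 4), dtype=np.float32)
empty_reduced = reduce_l2_reference(empty, axes=[1], keepdims=1)
print(empty_reduced.shape)  # (2, 1, 4), every entry is 0
```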

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceL2-1">1</a>, <a href="Changelog.md#ReduceL2-11">11</a>, <a href="Changelog.md#ReduceL2-13">13</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>
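
The interaction between `axes` and `noop_with_empty_axes` described above can be sketched as follows. The function `resolve_reduction_axes` is a hypothetical helper written for this illustration, not an ONNX function; it returns `None` to signal the Identity case.

```python
def resolve_reduction_axes(rank, axes=None, noop_with_empty_axes=0):
    """Hypothetical helper: return the axes to reduce, or None for a no-op."""
    if axes is None or len(axes) == 0:
        if noop_with_empty_axes:
            return None  # act as an Identity op
        return tuple(range(rank))  # reduce over all dimensions
    resolved = []
    for a in axes:
        a = int(a)
        if not -rank <= a <= rank - 1:
            raise ValueError(f"axis {a} is outside [-{rank}, {rank - 1}]")
        resolved.append(a % rank)  # map a negative axis to its positive equivalent
    return tuple(resolved)


print(resolve_reduction_axes(3))                                   # (0, 1, 2)
print(resolve_reduction_axes(3, axes=[-1]))                        # (2,)
print(resolve_reduction_axes(3, axes=[], noop_with_empty_axes=1))  # None
```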

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceL2", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sqrt(np.sum(a=np.square(data), axis=None, keepdims=keepdims == 1))
# print(reduced)
# [[[25.49509757]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_default_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(np.sum(a=np.square(data), axis=None, keepdims=keepdims == 1))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_default_axes_keepdims_random",
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceL2",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sqrt(
    np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
# print(reduced)
# [[2.23606798, 5.],
# [7.81024968, 10.63014581],
# [13.45362405, 16.2788206]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_do_not_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(
    np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_do_not_keepdims_random",
)
```

</details>


<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceL2",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceL2",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sqrt(
    np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
# print(reduced)
# [[[2.23606798], [5.]]
# [[7.81024968], [10.63014581]]
# [[13.45362405], [16.2788206 ]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_keep_dims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(
    np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_keep_dims_random",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceL2",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.reshape(np.arange(1, np.prod(shape) + 1, dtype=np.float32), shape)
# print(data)
# [[[1., 2.], [3., 4.]], [[5., 6.], [7., 8.]], [[9., 10.], [11., 12.]]]

reduced = np.sqrt(
    np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)
# print(reduced)
# [[[2.23606798], [5.]]
# [[7.81024968], [10.63014581]]
# [[13.45362405], [16.2788206 ]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_negative_axes_keep_dims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sqrt(
    np.sum(a=np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_l2_negative_axes_keep_dims_random",
)
```

</details>


### <a name="ReduceLogSum"></a><a name="reducelogsum">**ReduceLogSum**</a>

  Computes the log of the sum of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields minus infinity (if supported by the datatype) or undefined otherwise.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
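
A short NumPy sketch of the behavior described above, including the empty-set case. The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)

# Log of the sum along axis 1, keeping the reduced dimension (keepdims=1).
kept = np.log(np.sum(data, axis=1, keepdims=True))
print(kept.shape)  # (2, 1)

# Reduction over an empty set: the sum is 0, so the log is -inf.
empty = np.zeros((2, 0), dtype=np.float32)
with np.errstate(divide="ignore"):  # silence NumPy's log(0) warning
    empty_reduced = np.log(np.sum(empty, axis=1, keepdims=True))
print(np.isneginf(empty_reduced).all())  # True
```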

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceLogSum-1">1</a>, <a href="Changelog.md#ReduceLogSum-11">11</a>, <a href="Changelog.md#ReduceLogSum-13">13</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceLogSum",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = np.log(zero)  # -inf

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
node = onnx.helper.make_node(
    "ReduceLogSum", inputs=["data", "axes"], outputs=["reduced"]
)
data = np.random.ranf([3, 4, 5]).astype(np.float32)
reduced = np.log(np.sum(data, keepdims=True))
axes = np.array([], dtype=np.int64)
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_default",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
axes = np.array([-2], dtype=np.int64)
node = onnx.helper.make_node(
    "ReduceLogSum", inputs=["data", "axes"], outputs=["reduced"]
)
data = np.random.ranf([3, 4, 5]).astype(np.float32)
reduced = np.log(np.sum(data, axis=tuple(axes), keepdims=True))
# print(reduced)
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_negative_axes",
)
```

</details>


<details>
<summary>nokeepdims</summary>

```python
shape = [3, 4, 5]
axes = np.array([2, 1], dtype=np.int64)

node = onnx.helper.make_node(
    "ReduceLogSum",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=0,
)
data = np.random.ranf(shape).astype(np.float32)
reduced = np.log(np.sum(data, axis=tuple(axes), keepdims=False))
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_desc_axes",
)

axes = np.array([0, 1], dtype=np.int64)
node = onnx.helper.make_node(
    "ReduceLogSum",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=0,
)
data = np.random.ranf(shape).astype(np.float32)
reduced = np.log(np.sum(data, axis=tuple(axes), keepdims=False))
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_asc_axes",
)
```

</details>


### <a name="ReduceLogSumExp"></a><a name="reducelogsumexp">**ReduceLogSumExp**</a>

  Computes the log of the sum of the exponentials of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields minus infinity (if supported by the datatype) or undefined otherwise.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
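
The sketch below compares a direct NumPy transcription of this reduction with a max-shifted variant. The max-shift is a common numerical-stability technique and an assumption of this example; nothing above requires an implementation to use it. Both function names are hypothetical.

```python
import numpy as np


def logsumexp_naive(data, axis, keepdims=True):
    """Direct transcription: log(sum(exp(x)))."""
    return np.log(np.sum(np.exp(data), axis=axis, keepdims=keepdims))


def logsumexp_stable(data, axis, keepdims=True):
    """Max-shifted variant (assumption). Does not handle zero-length axes."""
    m = np.max(data, axis=axis, keepdims=True)
    shifted = np.log(np.sum(np.exp(data - m), axis=axis, keepdims=True)) + m
    return shifted if keepdims else np.squeeze(shifted, axis=axis)


data = np.array([[5.0, 20.0], [30.0, 40.0], [55.0, 60.0]], dtype=np.float64)
print(np.allclose(logsumexp_naive(data, 1), logsumexp_stable(data, 1)))  # True

# With large float32 inputs, exp overflows in the naive form only.
big = np.array([[1000.0, 1000.0]], dtype=np.float32)
with np.errstate(over="ignore"):
    naive_big = logsumexp_naive(big, 1)
stable_big = logsumexp_stable(big, 1)
print(np.isinf(naive_big).all())  # True
print(stable_big)                 # approximately 1000 + log(2)
```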

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceLogSumExp-1">1</a>, <a href="Changelog.md#ReduceLogSumExp-11">11</a>, <a href="Changelog.md#ReduceLogSumExp-13">13</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceLogSumExp",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=None, keepdims=keepdims == 1))
# print(reduced)
# [[[60.00671387]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_default_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(np.sum(np.exp(data), axis=None, keepdims=keepdims == 1))
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_default_axes_keepdims_random",
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0
node = onnx.helper.make_node(
    "ReduceLogSumExp",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
# print(reduced)
# [[20., 2.31326175]
# [40.00004578, 2.31326175]
# [60.00671387, 2.31326175]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_do_not_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_do_not_keepdims_random",
)
```

</details>


<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceLogSumExp",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = np.log(zero)  # -inf

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
    "ReduceLogSumExp",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
# print(reduced)
# [[[20., 2.31326175]]
# [[40.00004578, 2.31326175]]
# [[60.00671387, 2.31326175]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_keepdims_random",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1
node = onnx.helper.make_node(
    "ReduceLogSumExp",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]], dtype=np.double
)
reduced = np.log(np.sum(np.exp(data), axis=tuple(axes), keepdims=keepdims == 1))
# print(reduced)
# [[[20., 2.31326175]]
# [[40.00004578, 2.31326175]]
# [[60.00671387, 2.31326175]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_negative_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.double)
reduced = np.log(
    np.sum(np.exp(data), axis=tuple(axes.tolist()), keepdims=keepdims == 1)
)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_log_sum_exp_negative_axes_keepdims_random",
)
```

</details>


### <a name="ReduceMax"></a><a name="reducemax">**ReduceMax**</a>

  Computes the max of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields minus infinity (if supported by the datatype) or the minimum value of the data type otherwise.


  If the input data type is Boolean, the comparison should consider `False < True`.

  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
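
The following NumPy sketch illustrates the Boolean ordering and the empty-set rule stated above. The helper `reduce_max_identity` is hypothetical, and treating `False` as the minimum value of the Boolean type is an inference from `False < True` rather than something stated explicitly here.

```python
import numpy as np


def reduce_max_identity(dtype):
    """Hypothetical helper: result of reducing an empty set for `dtype`."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.floating):
        return dtype.type(-np.inf)  # minus infinity is representable
    if dtype == np.bool_:
        return np.False_            # assumed minimum under False < True
    return np.iinfo(dtype).min      # minimum value of an integer type


# Boolean inputs: max behaves like a logical OR because False < True.
flags = np.array([[True, False], [False, False]])
bool_max = np.maximum.reduce(flags, axis=1)
print(bool_max)  # [ True False]

# Empty-set reduction, passing the identity as NumPy's `initial` value.
empty = np.zeros((2, 0), dtype=np.int32)
empty_max = np.max(empty, axis=1, initial=reduce_max_identity(np.int32))
print(empty_max == np.iinfo(np.int32).min)  # [ True  True]
```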

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceMax-1">1</a>, <a href="Changelog.md#ReduceMax-11">11</a>, <a href="Changelog.md#ReduceMax-12">12</a>, <a href="Changelog.md#ReduceMax-13">13</a>, <a href="Changelog.md#ReduceMax-18">18</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16), tensor(uint8), tensor(int8), tensor(bool)</dt>
<dd>Constrain input and output types to numeric and Boolean tensors.</dd>
</dl>


#### Examples

<details>
<summary>bool_inputs</summary>

```python
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMax",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[True, True], [True, False], [False, True], [False, False]],
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=bool(keepdims))
# print(reduced)
# [[True],
#  [True],
#  [True],
#  [False]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_bool_inputs",
)
```

</details>


<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = None
keepdims = 1
node = onnx.helper.make_node(
    "ReduceMax", inputs=["data"], outputs=["reduced"], keepdims=keepdims
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=axes, keepdims=keepdims == 1)

expect(
    node,
    inputs=[data],
    outputs=[reduced],
    name="test_reduce_max_default_axes_keepdim_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=axes, keepdims=keepdims == 1)

expect(
    node,
    inputs=[data],
    outputs=[reduced],
    name="test_reduce_max_default_axes_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceMax",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[20., 2.]
# [40., 2.]
# [60., 2.]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_do_not_keepdims_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_do_not_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceMax",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
one = np.array(np.ones(reduced_shape, dtype=np.float32))
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = -(one / zero)  # -inf

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMax",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[20., 2.]]
# [[40., 2.]]
# [[60., 2.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_keepdims_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMax",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[20., 2.]]
# [[40., 2.]]
# [[60., 2.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_negative_axes_keepdims_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.maximum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_max_negative_axes_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


### <a name="ReduceMean"></a><a name="reducemean">**ReduceMean**</a>

  Computes the mean of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields an undefined result.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
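
As an illustrative sketch, the mean along a set of axes can be written as the sum divided by the number of elements reduced over. The variable names below are chosen for this example only.

```python
import numpy as np

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
axes = (1,)

# The mean is the sum divided by the number of elements reduced over.
count = int(np.prod([data.shape[a] for a in axes]))
by_hand = np.sum(data, axis=axes, keepdims=True) / count
print(np.allclose(by_hand, np.mean(data, axis=axes, keepdims=True)))  # True
print(by_hand.shape)  # (3, 1, 2)
```

For a zero-length axis, `count` would be 0, which is one way to see why the empty-set result is left undefined.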

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceMean-1">1</a>, <a href="Changelog.md#ReduceMean-11">11</a>, <a href="Changelog.md#ReduceMean-13">13</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMean",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.mean(data, axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[18.25]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_default_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=None, keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_default_axes_keepdims_random",
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceMean",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[12.5, 1.5]
# [35., 1.5]
# [57.5, 1.5]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_do_not_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_do_not_keepdims_random",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMean",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[12.5, 1.5]]
# [[35., 1.5]]
# [[57.5, 1.5]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_keepdims_random",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMean",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[12.5, 1.5]]
# [[35., 1.5]]
# [[57.5, 1.5]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_negative_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.mean(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_mean_negative_axes_keepdims_random",
)
```

</details>


### <a name="ReduceMin"></a><a name="reducemin">**ReduceMin**</a>

  Computes the min of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields plus infinity (if supported by the datatype) or the maximum value of the data type otherwise.


  If the input data type is Boolean, the comparison should consider `False < True`.

  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
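
The empty-set and Boolean rules above can be illustrated with a NumPy-only sketch. The helper name `reduce_min_sketch` is hypothetical and not part of ONNX; it assumes a NumPy float, integer, or bool array, and it passes the identity value described above through the `initial` argument of `np.minimum.reduce`.

```python
import numpy as np


def reduce_min_sketch(data, axes=None, keepdims=1):
    """Hypothetical NumPy emulation of ReduceMin; not an ONNX API."""
    axis = None if axes is None or len(axes) == 0 else tuple(int(a) for a in axes)
    # Identity value returned when the reduction runs over an empty set.
    if data.dtype == np.bool_:
        initial = np.True_  # False < True, so True is the largest value
    elif np.issubdtype(data.dtype, np.floating):
        initial = data.dtype.type(np.inf)
    else:
        initial = data.dtype.type(np.iinfo(data.dtype).max)
    return np.minimum.reduce(
        data, axis=axis, keepdims=bool(keepdims), initial=initial
    )


empty = np.zeros((2, 0, 4), dtype=np.float32)
print(reduce_min_sketch(empty, axes=[1]).shape)  # reduced axis is kept with size 1
```

Because the identity value never wins against a real element, the same call also covers the non-empty case.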

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceMin-1">1</a>, <a href="Changelog.md#ReduceMin-11">11</a>, <a href="Changelog.md#ReduceMin-12">12</a>, <a href="Changelog.md#ReduceMin-13">13</a>, <a href="Changelog.md#ReduceMin-18">18</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16), tensor(uint8), tensor(int8), tensor(bool)</dt>
<dd>Constrain input and output types to numeric and Boolean tensors.</dd>
</dl>


#### Examples

<details>
<summary>bool_inputs</summary>

```python
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMin",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[True, True], [True, False], [False, True], [False, False]],
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=bool(keepdims))
# print(reduced)
# [[ True],
#  [False],
#  [False],
#  [False]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_bool_inputs",
)
```

</details>


<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = None
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMin", inputs=["data"], outputs=["reduced"], keepdims=keepdims
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=axes, keepdims=keepdims == 1)
# print(reduced)
# [[[1.]]]

expect(
    node,
    inputs=[data],
    outputs=[reduced],
    name="test_reduce_min_default_axes_keepdims_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=axes, keepdims=keepdims == 1)

expect(
    node,
    inputs=[data],
    outputs=[reduced],
    name="test_reduce_min_default_axes_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceMin",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[5., 1.]
# [30., 1.]
# [55., 1.]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_do_not_keepdims_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_do_not_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceMin",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
one = np.array(np.ones(reduced_shape, dtype=np.float32))
zero = np.array(np.zeros(reduced_shape, dtype=np.float32))
reduced = one / zero  # inf

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMin",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[5., 1.]]
# [[30., 1.]]
# [[55., 1.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_keepdims_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceMin",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[5, 1], [20, 2]], [[30, 1], [40, 2]], [[55, 1], [60, 2]]],
    dtype=np.float32,
)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[5., 1.]]
# [[30., 1.]]
# [[55., 1.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_negative_axes_keepdims_example",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.minimum.reduce(data, axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_min_negative_axes_keepdims_random",
    opset_imports=[onnx.helper.make_opsetid("", 18)],
)
```

</details>


### <a name="ReduceProd"></a><a name="reduceprod">**ReduceProd**</a>

  Computes the product of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields 1.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
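
To make the `keepdims` difference concrete, the following NumPy snippet is a small sketch (plain NumPy, no ONNX runtime involved) comparing the ONNX default with the NumPy default, and showing the empty product.

```python
import numpy as np

data = np.arange(1, 13, dtype=np.float32).reshape(3, 2, 2)

# ONNX default (keepdims=1): the reduced axis stays, with size 1.
kept = np.prod(data, axis=1, keepdims=True)
# NumPy default (keepdims=False): the reduced axis is dropped, like keepdims=0.
dropped = np.prod(data, axis=1)
print(kept.shape, dropped.shape)

# Reducing an axis of size zero yields the multiplicative identity, 1.
empty = np.prod(np.zeros((2, 0, 4), dtype=np.float32), axis=1, keepdims=True)
```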

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceProd-1">1</a>, <a href="Changelog.md#ReduceProd-11">11</a>, <a href="Changelog.md#ReduceProd-13">13</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = None
keepdims = 1

node = onnx.helper.make_node(
    "ReduceProd", inputs=["data"], outputs=["reduced"], keepdims=keepdims
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=axes, keepdims=keepdims == 1)
# print(reduced)
# [[[4.790016e+08]]]

expect(
    node,
    inputs=[data],
    outputs=[reduced],
    name="test_reduce_prod_default_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=axes, keepdims=keepdims == 1)
expect(
    node,
    inputs=[data],
    outputs=[reduced],
    name="test_reduce_prod_default_axes_keepdims_random",
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceProd",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[3., 8.]
# [35., 48.]
# [99., 120.]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_prod_do_not_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_prod_do_not_keepdims_random",
)
```

</details>


<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceProd",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.ones(reduced_shape, dtype=np.float32))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_prod_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceProd",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3., 8.]]
# [[35., 48.]]
# [[99., 120.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_prod_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_prod_keepdims_random",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceProd",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[3., 8.]]
# [[35., 48.]]
# [[99., 120.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_prod_negative_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.prod(data, axis=tuple(axes), keepdims=keepdims == 1)
expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_prod_negative_axes_keepdims_random",
)
```

</details>


### <a name="ReduceSum"></a><a name="reducesum">**ReduceSum**</a>

  Computes the sum of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields 0.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
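
How the reduction interacts with the `axes` input and the `noop_with_empty_axes` attribute documented below can be sketched in NumPy. The function `reduce_sum_sketch` is a hypothetical helper written for illustration, not an ONNX API, and it only approximates the operator for NumPy-supported dtypes.

```python
import numpy as np


def reduce_sum_sketch(data, axes=None, keepdims=1, noop_with_empty_axes=0):
    """Hypothetical NumPy emulation of ReduceSum; not an ONNX API."""
    if axes is None or len(axes) == 0:
        if noop_with_empty_axes:
            return data.copy()  # behaves like an Identity op
        axis = None  # default: reduce over every dimension
    else:
        rank = data.ndim
        for a in axes:
            # Accepted range is [-r, r-1] where r = rank(data).
            if not -rank <= a <= rank - 1:
                raise ValueError(f"axis {a} is outside [-{rank}, {rank - 1}]")
        axis = tuple(int(a) for a in axes)
    return np.sum(data, axis=axis, keepdims=bool(keepdims))


data = np.arange(1, 13, dtype=np.float32).reshape(3, 2, 2)
print(reduce_sum_sketch(data))  # all axes reduced, rank preserved
print(reduce_sum_sketch(data, axes=[-2]).shape)  # same as axes=[1]
```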

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceSum-1">1</a>, <a href="Changelog.md#ReduceSum-11">11</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[78.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_default_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=None, keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_default_axes_keepdims_random",
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
# print(reduced)
# [[4., 6.]
# [12., 14.]
# [20., 22.]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_do_not_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_do_not_keepdims_random",
)
```

</details>


<details>
<summary>empty_axes_input_noop</summary>

```python
shape = [3, 2, 2]
keepdims = 1

node = onnx.helper.make_node(
    "ReduceSum",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
    noop_with_empty_axes=True,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
axes = np.array([], dtype=np.int64)
reduced = np.array(data)
# print(reduced)
# [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_empty_axes_input_noop_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.array(data)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_empty_axes_input_noop_random",
)
```

</details>


<details>
<summary>empty_set</summary>

```python
"""Test case with the reduced-axis of size zero."""
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceSum",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
# print(reduced)
# [[[4., 6.]]
# [[12., 14.]]
# [[20., 22.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_keepdims_random",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceSum", inputs=["data", "axes"], outputs=["reduced"], keepdims=keepdims
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)
# print(reduced)
# [[[4., 6.]]
# [[12., 14.]]
# [[20., 22.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_negative_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(data, axis=tuple(axes.tolist()), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_negative_axes_keepdims_random",
)
```

</details>


<details>
<summary>non_reduced_axis_zero</summary>

```python
"""Test case with the non-reduced-axis of size zero."""
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 0, 1]

node = onnx.helper.make_node(
    "ReduceSum",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([2], dtype=np.int64)
reduced = np.array([], dtype=np.float32).reshape(reduced_shape)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_empty_set_non_reduced_axis_zero",
)
```

</details>


### <a name="ReduceSumSquare"></a><a name="reducesumsquare">**ReduceSumSquare**</a>

  Computes the sum of the squares of the input tensor's elements along the provided axes. The resulting
  tensor has the same rank as the input if `keepdims` equals 1. If `keepdims` equals 0, then
  the resulting tensor has the reduced dimension pruned. Input tensors of rank zero are
  valid. Reduction over an empty set of values yields 0.


  The above behavior is similar to numpy, with the exception that numpy defaults `keepdims`
  to `False` instead of `True`.
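
As a small NumPy sketch of the definition above (not an ONNX API call), the reduction squares each element and then sums along the chosen axes. For floating-point data this should agree, up to rounding, with the squared L2 norm along the same axes.

```python
import numpy as np

data = np.array([[1, 2], [3, 4]], dtype=np.float32)

# Square every element first, then sum along axis 1, keeping the reduced axis.
sum_square = np.sum(np.square(data), axis=1, keepdims=True)

# Equivalent formulation up to floating-point rounding: squared L2 norm per row.
l2_squared = np.linalg.norm(data, axis=1, keepdims=True) ** 2
```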

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ReduceSumSquare-1">1</a>, <a href="Changelog.md#ReduceSumSquare-11">11</a>, <a href="Changelog.md#ReduceSumSquare-13">13</a>

#### Attributes

<dl>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the reduced dimension or not, default 1 means keep reduced dimension.</dd>
<dt><tt>noop_with_empty_axes</tt> : int (default is 0)</dt>
<dd>Defines behavior if 'axes' is empty. Default behavior with 'false' is to reduce all axes. When axes is empty and this attribute is set to true, the input tensor will not be reduced, and the output tensor will be equivalent to the input tensor.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional input list of integers, along which to reduce. The default is to reduce over all the dimensions of the input tensor if 'noop_with_empty_axes' is false, else act as an Identity op when 'noop_with_empty_axes' is true. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reduced</tt> (differentiable) : T</dt>
<dd>Reduced output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint32), tensor(uint64), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>default_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceSumSquare",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=None, keepdims=keepdims == 1)
# print(reduced)
# [[[650.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_default_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=None, keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_default_axes_keepdims_random",
)
```

</details>


<details>
<summary>do_not_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 0

node = onnx.helper.make_node(
    "ReduceSumSquare",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[10., 20.]
# [74., 100.]
# [202., 244.]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_do_not_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_do_not_keepdims_random",
)
```

</details>


<details>
<summary>empty_set</summary>

```python
shape = [2, 0, 4]
keepdims = 1
reduced_shape = [2, 1, 4]

node = onnx.helper.make_node(
    "ReduceSumSquare",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array([], dtype=np.float32).reshape(shape)
axes = np.array([1], dtype=np.int64)
reduced = np.array(np.zeros(reduced_shape, dtype=np.float32))

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_empty_set",
)
```

</details>


<details>
<summary>keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([1], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceSumSquare",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[10., 20.]]
# [[74., 100.]]
# [[202., 244.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_keepdims_random",
)
```

</details>


<details>
<summary>negative_axes_keepdims</summary>

```python
shape = [3, 2, 2]
axes = np.array([-2], dtype=np.int64)
keepdims = 1

node = onnx.helper.make_node(
    "ReduceSumSquare",
    inputs=["data", "axes"],
    outputs=["reduced"],
    keepdims=keepdims,
)

data = np.array(
    [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]], dtype=np.float32
)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)
# print(reduced)
# [[[10., 20.]]
# [[74., 100.]]
# [[202., 244.]]]

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_negative_axes_keepdims_example",
)

np.random.seed(0)
data = np.random.uniform(-10, 10, shape).astype(np.float32)
reduced = np.sum(np.square(data), axis=tuple(axes), keepdims=keepdims == 1)

expect(
    node,
    inputs=[data, axes],
    outputs=[reduced],
    name="test_reduce_sum_square_negative_axes_keepdims_random",
)
```

</details>


### <a name="RegexFullMatch"></a><a name="regexfullmatch">**RegexFullMatch**</a>

  RegexFullMatch performs a full regex match on each element of the input tensor. If an element fully matches the regex pattern specified as an attribute, the corresponding element in the output is True and it is False otherwise. [RE2](https://github.com/google/re2/wiki/Syntax) regex syntax is used.
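
The difference between a full match and a partial match can be sketched with Python's standard `re` module. This is only an approximation: Python's `re` is not RE2, so some patterns may behave differently, and `regex_full_match_sketch` is a hypothetical helper name, not an ONNX API.

```python
import re

import numpy as np


def regex_full_match_sketch(x, pattern):
    """Hypothetical emulation using Python's `re` (an assumption; ONNX uses RE2)."""
    compiled = re.compile(pattern)
    # fullmatch succeeds only if the whole string matches the pattern.
    flat = [compiled.fullmatch(s) is not None for s in x.ravel()]
    return np.array(flat, dtype=bool).reshape(x.shape)


x = np.array([["www.google.com", "see www.google.com"]], dtype=object)
print(regex_full_match_sketch(x, r"www\.[\w.-]+\.com"))
```

The second string contains a matching substring but has a prefix outside the pattern, so it does not count as a full match.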

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>pattern</tt> : string</dt>
<dd>Regex pattern to match on. This must be valid RE2 syntax.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T1</dt>
<dd>Tensor with strings to match on.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T2</dt>
<dd>Tensor of bools indicating if each input string fully matches the regex pattern specified.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(string)</dt>
<dd>Inputs must be UTF-8 strings.</dd>
<dt><tt>T2</tt> : tensor(bool)</dt>
<dd>Outputs are bools and are True where there is a full regex match and False otherwise.</dd>
</dl>


#### Examples

<details>
<summary>basic</summary>

```python
node = onnx.helper.make_node(
    "RegexFullMatch",
    inputs=["X"],
    outputs=["Y"],
    pattern=r"www\.[\w.-]+\.\bcom\b",
)

x = np.array(["www.google.com", "www.facebook.com", "www.bbc.co.uk"]).astype(
    object
)
result = np.array([True, True, False])
expect(node, inputs=[x], outputs=[result], name="test_regex_full_match_basic")
```

</details>


<details>
<summary>match_email_domain</summary>

```python
node = onnx.helper.make_node(
    "RegexFullMatch",
    inputs=["X"],
    outputs=["Y"],
    pattern=r"(\W|^)[\w.\-]{0,25}@(yahoo|gmail)\.com(\W|$)",
)

x = np.array(
    [
        ["account@gmail.com", "account@hotmail.com"],
        ["not email", "account2@yahoo.com"],
    ]
).astype(object)
result = np.array([[True, False], [False, True]])
expect(
    node,
    inputs=[x],
    outputs=[result],
    name="test_regex_full_match_email_domain",
)
```

</details>


<details>
<summary>match_empty</summary>

```python
node = onnx.helper.make_node(
    "RegexFullMatch",
    inputs=["X"],
    outputs=["Y"],
    pattern=r"(\W|^)[\w.\-]{0,25}@(yahoo|gmail)\.com(\W|$)",
)

x = np.array([[], []]).astype(object)
result = np.array([[], []]).astype(bool)
expect(
    node,
    inputs=[x],
    outputs=[result],
    name="test_regex_full_match_empty",
)
```

</details>


### <a name="Relu"></a><a name="relu">**Relu**</a>

  Relu takes one input tensor (`Tensor<T>`) and produces one output tensor
  (`Tensor<T>`) where the rectified linear function, y = max(0, x), is applied to
  the tensor elementwise.
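
The formula y = max(0, x) can be written directly in NumPy. This is a minimal sketch of the elementwise behavior, not a call into any ONNX runtime.

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5], dtype=np.float32)

# Elementwise max(0, x): negative entries become 0, the rest pass through.
y = np.maximum(x, 0)
print(y)
```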

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Relu-1">1</a>, <a href="Changelog.md#Relu-6">6</a>, <a href="Changelog.md#Relu-13">13</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float), tensor(int32), tensor(int8), tensor(int16), tensor(int64), tensor(float16), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to signed numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>relu</summary>

```python
node = onnx.helper.make_node(
    "Relu",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, 0, np.inf)

expect(node, inputs=[x], outputs=[y], name="test_relu")
```

</details>


### <a name="Reshape"></a><a name="reshape">**Reshape**</a>

  Reshape the input tensor similar to numpy.reshape.
  First input is the data tensor, second input is a shape tensor which specifies the output shape. It outputs the reshaped tensor.
  At most one dimension of the new shape can be -1. In this case, the value is
  inferred from the size of the tensor and the remaining dimensions. A dimension
  could also be 0, in which case the actual dimension value is unchanged (i.e. taken
  from the input tensor). If 'allowzero' is set, and the new shape includes 0, the
  dimension will be set explicitly to zero (i.e. not taken from input tensor).
  Shape (second input) could be an empty shape, which means converting to a scalar.
  The input tensor's shape and the output tensor's shape are required to have the same number of elements.

  If the attribute 'allowzero' is set, it is invalid for the specified shape to
  contain both a zero value and -1, as the value of the dimension corresponding
  to -1 cannot be determined uniquely.
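
The shape rules above can be sketched in plain Python. This is an illustrative helper, not the `reshape_reference_implementation` used in the examples below; the name `resolve_reshape_shape` and its error messages are assumptions made for this sketch.

```python
from math import prod


def resolve_reshape_shape(input_shape, shape, allowzero=0):
    """Return the concrete output shape (hypothetical helper, for illustration)."""
    shape = list(shape)
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension of the new shape can be -1")
    if allowzero and 0 in shape and -1 in shape:
        raise ValueError("with allowzero=1, shape cannot contain both 0 and -1")

    out = list(shape)
    if not allowzero:
        # A 0 copies the dimension at the same index from the input shape.
        for i, dim in enumerate(out):
            if dim == 0:
                out[i] = input_shape[i]

    total = prod(input_shape)
    if -1 in out:
        known = prod(d for d in out if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the dimension marked -1")
        # Infer the remaining dimension from the element count.
        out[out.index(-1)] = total // known

    if prod(out) != total:
        raise ValueError("input and output must have the same number of elements")
    return out


print(resolve_reshape_shape([2, 3, 4], [2, 0, 1, -1]))          # 0 copied, -1 inferred
print(resolve_reshape_shape([0, 3, 4], [3, 4, 0], allowzero=1))  # 0 honored as-is
```

The sketch resolves 0 entries before inferring -1, which is why the two can be combined only when `allowzero` is 0.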

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Reshape-1">1</a>, <a href="Changelog.md#Reshape-5">5</a>, <a href="Changelog.md#Reshape-13">13</a>, <a href="Changelog.md#Reshape-14">14</a>, <a href="Changelog.md#Reshape-19">19</a>

#### Attributes

<dl>
<dt><tt>allowzero</tt> : int (default is 0)</dt>
<dd>(Optional) By default, when any value in the 'shape' input is equal to zero the corresponding dimension value is copied from the input tensor dynamically. allowzero=1 indicates that if any value in the 'shape' input is set to zero, the zero value is honored, similar to NumPy.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
<dt><tt>shape</tt> (non-differentiable) : tensor(int64)</dt>
<dd>Specified shape for output.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>reshaped</tt> (differentiable) : T</dt>
<dd>Reshaped data.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>allowzero</summary>

```python
original_shape = [0, 3, 4]
test_cases = {
    "allowzero_reordered": np.array([3, 4, 0], dtype=np.int64),
}
data = np.random.random_sample(original_shape).astype(np.float32)

for test_name, shape in test_cases.items():
    node = onnx.helper.make_node(
        "Reshape",
        inputs=["data", "shape"],
        outputs=["reshaped"],
        allowzero=1,  # if allowzero=1, final shape = (3, 4, 0)
        # if allowzero=0, final shape = (3, 4, 4)
    )

    reshaped = reshape_reference_implementation(data, shape, allowzero=1)

    expect(
        node,
        inputs=[data, shape],
        outputs=[reshaped],
        name="test_reshape_" + test_name,
    )
```

</details>


<details>
<summary>reshape</summary>

```python
original_shape = [2, 3, 4]
test_cases = {
    "reordered_all_dims": np.array([4, 2, 3], dtype=np.int64),
    "reordered_last_dims": np.array([2, 4, 3], dtype=np.int64),
    "reduced_dims": np.array([2, 12], dtype=np.int64),
    "extended_dims": np.array([2, 3, 2, 2], dtype=np.int64),
    "one_dim": np.array([24], dtype=np.int64),
    "negative_dim": np.array([2, -1, 2], dtype=np.int64),
    "negative_extended_dims": np.array([-1, 2, 3, 4], dtype=np.int64),
    "zero_dim": np.array([2, 0, 4, 1], dtype=np.int64),
    "zero_and_negative_dim": np.array([2, 0, 1, -1], dtype=np.int64),
}
data = np.random.random_sample(original_shape).astype(np.float32)

for test_name, shape in test_cases.items():
    node = onnx.helper.make_node(
        "Reshape",
        inputs=["data", "shape"],
        outputs=["reshaped"],
    )

    reshaped = reshape_reference_implementation(data, shape)

    expect(
        node,
        inputs=[data, shape],
        outputs=[reshaped],
        name="test_reshape_" + test_name,
    )
```

</details>


### <a name="Resize"></a><a name="resize">**Resize**</a>

  Resize the input tensor. In general, each value in the output tensor is computed as a weighted average of a neighborhood of values (a.k.a. sampling locations) in the input tensor.
  Each dimension value of the output tensor is:
  ```
  output_dimension = floor(input_dimension * (roi_end - roi_start) * scale)
  ```
  if input "sizes" is not specified.
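
As a rough illustration of the formula above, the following sketch computes the output shape from `scales`. The helper name `resize_output_shape` is hypothetical, and the default of `roi_start = 0`, `roi_end = 1` for every axis is an assumption of this sketch for the case where no RoI applies.

```python
import math


def resize_output_shape(input_shape, scales, roi_start=None, roi_end=None):
    """Apply floor(input_dimension * (roi_end - roi_start) * scale) per axis."""
    rank = len(input_shape)
    # Assumption: without an RoI, each axis spans the full range [0, 1].
    roi_start = roi_start or [0.0] * rank
    roi_end = roi_end or [1.0] * rank
    return [
        math.floor(dim * (end - start) * scale)
        for dim, start, end, scale in zip(input_shape, roi_start, roi_end, scales)
    ]


print(resize_output_shape([1, 1, 4, 4], [1.0, 1.0, 0.8, 0.8]))
print(resize_output_shape([1, 1, 2, 4], [1.0, 1.0, 0.6, 0.6]))
```

Because of the `floor`, a 4-element axis scaled by 0.8 yields 3 elements and a 2-element axis scaled by 0.6 yields 1.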

#### Version

This version of the operator has been available since version 19 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Resize-10">10</a>, <a href="Changelog.md#Resize-11">11</a>, <a href="Changelog.md#Resize-13">13</a>, <a href="Changelog.md#Resize-18">18</a>

#### Attributes

<dl>
<dt><tt>antialias</tt> : int (default is 0)</dt>
<dd>If set to 1, "linear" and "cubic" interpolation modes will use an antialiasing filter when downscaling. Antialiasing is achieved by stretching the resampling filter by a factor max(1, 1 / scale), which means that when downsampling, more input pixels contribute to an output pixel.</dd>
<dt><tt>axes</tt> : list of ints</dt>
<dd>If provided, it specifies a subset of axes that 'roi', 'scales' and 'sizes' refer to. If not provided, all axes are assumed [0, 1, ..., r-1], where r = rank(data). Non-specified dimensions are interpreted as non-resizable. Negative value means counting dimensions from the back. Accepted range is [-r, r-1], where r = rank(data). Behavior is undefined if an axis is repeated.</dd>
<dt><tt>coordinate_transformation_mode</tt> : string (default is half_pixel)</dt>
<dd>
This attribute describes how to transform the coordinate in the resized tensor to the coordinate in the original tensor.

The coordinate of each dimension is transformed individually. Let's describe a case using axis x as an example.
Denote `x_resized` as the coordinate of axis x in the resized tensor,
 `x_original` as the coordinate of axis x in the original tensor,
 `length_original` as the length of the original tensor in axis x,
 `length_resized` as the length of the resized tensor in axis x,
 `scale = length_resized / length_original`,
 `output_width` the target length on the axis x which can be a fractional number when it is calculated out of a scale factor,
 and `output_width_int` the effective output width as an integer.

if coordinate_transformation_mode is `"half_pixel"`,
```
x_original = (x_resized + 0.5) / scale - 0.5
```

if coordinate_transformation_mode is `"half_pixel_symmetric"`,
```
adjustment = output_width_int / output_width
center = length_original / 2
offset = center * (1 - adjustment)
x_original = offset + (x_resized + 0.5) / scale - 0.5
```

if coordinate_transformation_mode is `"pytorch_half_pixel"`,
```
x_original = length_resized > 1 ? (x_resized + 0.5) / scale - 0.5 : 0
```

if coordinate_transformation_mode is `"align_corners"`,
```
x_original = x_resized * (length_original - 1) / (length_resized - 1)
```

if coordinate_transformation_mode is `"asymmetric"`,
```
x_original = x_resized / scale
```

if coordinate_transformation_mode is `"tf_crop_and_resize"`,
```
x_original = length_resized > 1 ? start_x * (length_original - 1) + x_resized * (end_x - start_x) * (length_original - 1) / (length_resized - 1) : 0.5 * (start_x + end_x) * (length_original - 1)
```
</dd>
<dt><tt>cubic_coeff_a</tt> : float (default is -0.75)</dt>
<dd>The coefficient 'a' used in cubic interpolation. Two common choices are -0.5 (in some cases of TensorFlow) and -0.75 (in PyTorch). Check out Equation (4) in https://ieeexplore.ieee.org/document/1163711 for the details. This attribute is valid only if mode is "cubic".</dd>
<dt><tt>exclude_outside</tt> : int (default is 0)</dt>
<dd>If set to 1, the weight of sampling locations outside the tensor will be set to 0 and the weight will be renormalized so that their sum is 1.0. The default value is 0.</dd>
<dt><tt>extrapolation_value</tt> : float (default is 0.0)</dt>
<dd>When coordinate_transformation_mode is "tf_crop_and_resize" and x_original is outside the range [0, length_original - 1], this value is used as the corresponding output value. Default is 0.0f.</dd>
<dt><tt>keep_aspect_ratio_policy</tt> : string (default is stretch)</dt>
<dd>
This attribute describes how to interpret the `sizes` input with regard to keeping the original aspect ratio of the input, and it is not applicable when
the `scales` input is used.

Given a set of `sizes`, associated with a subset of `axes` (explicitly provided or default), and assuming `d = axes[i]`, with `i` being the index of the provided `sizes`.

If `keep_aspect_ratio_policy` is `"stretch"`, the original aspect ratio is disregarded, and the input is resized to the specified size:
`out_size[d] = sizes[i]`

If `keep_aspect_ratio_policy` is `"not_larger"`, the sizes are adjusted so that no extent of the output is larger than the specified size, while keeping the original aspect ratio:
```
scale = Min(sizes[i] / in_size[d])
out_size[d] = round_int(scale * in_size[d])
```

If `keep_aspect_ratio_policy` is `"not_smaller"`, the sizes are adjusted so that no extent of the output is smaller than the specified size, while keeping the original aspect ratio:
```
scale = Max(sizes[i] / in_size[d])
out_size[d] = round_int(scale * in_size[d])
```

For non-resizable axes (those not specified in `axes`), the output size will be equal to the input size.

Note: `round_int` stands for computing the nearest integer value, rounding halfway cases up.</dd>
<dt><tt>mode</tt> : string (default is nearest)</dt>
<dd>Three interpolation modes: "nearest" (default), "linear" and "cubic". The "linear" mode includes linear interpolation for 1D tensor and N-linear interpolation for N-D tensor (for example, bilinear interpolation for 2D tensor). The "cubic" mode includes cubic interpolation for 1D tensor and N-cubic interpolation for N-D tensor (for example, bicubic interpolation for 2D tensor).</dd>
<dt><tt>nearest_mode</tt> : string (default is round_prefer_floor)</dt>
<dd>Four modes: "round_prefer_floor" (default, also known as round half down), "round_prefer_ceil" (also known as round half up), "floor", "ceil". Only used by nearest interpolation. It indicates how to get the "nearest" pixel in the input tensor from x_original, so this attribute is valid only if "mode" is "nearest".</dd>
</dl>
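
The coordinate transformation formulas listed under `coordinate_transformation_mode` can be collected into one function, as a minimal sketch. The function name `to_original` is hypothetical. For `"half_pixel_symmetric"` the sketch assumes `output_width = scale * length_original` and `output_width_int = length_resized`; that reading of the definitions is an assumption, not something the attribute text states explicitly.

```python
def to_original(x_resized, mode, length_original, length_resized, scale,
                start_x=0.0, end_x=1.0):
    """Map a coordinate in the resized tensor back to the original tensor."""
    if mode == "half_pixel":
        return (x_resized + 0.5) / scale - 0.5
    if mode == "half_pixel_symmetric":
        output_width = scale * length_original  # assumption: may be fractional
        adjustment = length_resized / output_width
        center = length_original / 2
        offset = center * (1 - adjustment)
        return offset + (x_resized + 0.5) / scale - 0.5
    if mode == "pytorch_half_pixel":
        return (x_resized + 0.5) / scale - 0.5 if length_resized > 1 else 0.0
    if mode == "align_corners":
        # Note: the formula divides by (length_resized - 1), so it needs
        # length_resized > 1; this sketch does not guard against that.
        return x_resized * (length_original - 1) / (length_resized - 1)
    if mode == "asymmetric":
        return x_resized / scale
    if mode == "tf_crop_and_resize":
        if length_resized > 1:
            return (start_x * (length_original - 1)
                    + x_resized * (end_x - start_x) * (length_original - 1)
                    / (length_resized - 1))
        return 0.5 * (start_x + end_x) * (length_original - 1)
    raise ValueError(f"unknown coordinate_transformation_mode: {mode}")


for mode in ("half_pixel", "align_corners", "asymmetric"):
    print(mode, to_original(1, mode, length_original=4, length_resized=3, scale=0.75))
```

Passing `scale` explicitly keeps the sketch usable both when the scale comes from the `scales` input and when it is derived from `sizes`.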

#### Inputs (1 - 4)

<dl>
<dt><tt>X</tt> (differentiable) : T1</dt>
<dd>N-D tensor</dd>
<dt><tt>roi</tt> (optional, non-differentiable) : T2</dt>
<dd>1-D tensor given as [start1, ..., startN, end1, ..., endN], where N is the rank of X or the length of axes, if provided. The RoIs' coordinates are normalized in the coordinate system of the input image. It only takes effect when coordinate_transformation_mode is "tf_crop_and_resize".</dd>
<dt><tt>scales</tt> (optional, non-differentiable) : tensor(float)</dt>
<dd>The scale array along each dimension. Each value must be greater than 0. A value less than 1 downsamples; otherwise it upsamples. The number of elements of 'scales' should be the same as the rank of input 'X' or the length of 'axes', if provided. One of 'scales' and 'sizes' MUST be specified and it is an error if both are specified. If 'sizes' is needed, the user can use an empty string as the name of 'scales' in this operator's input list.</dd>
<dt><tt>sizes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Target size of the output tensor. Its interpretation depends on the 'keep_aspect_ratio_policy' value. The number of elements of 'sizes' should be the same as the rank of input 'X', or the length of 'axes', if provided. Only one of 'scales' and 'sizes' can be specified.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T1</dt>
<dd>N-D tensor after resizing</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input 'X' and output 'Y' to all tensor types.</dd>
<dt><tt>T2</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain roi type to float16, float or double.</dd>
</dl>


#### Examples

<details>
<summary>resize_downsample_scales_cubic</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.8, 0.8], dtype=np.float32)

# [[[[ 1.47119141  2.78125     4.08251953]
#    [ 6.71142578  8.02148438  9.32275391]
#    [11.91650391 13.2265625  14.52783203]]]]
output = interpolate_nd(
    data, lambda x, _: cubic_coeffs(x), scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_cubic",
)
```

</details>
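
The helper `cubic_coeffs` used in the example above is not shown in this document. As a rough sketch of what cubic interpolation with `cubic_coeff_a` can look like, the code below implements the cubic convolution kernel with coefficient `a`, the default `"half_pixel"` transformation, and edge clamping for taps outside the tensor. The function names and the clamping choice are assumptions of this sketch, not the reference implementation.

```python
import math


def cubic_weights(t, a=-0.75):
    """Weights of the four taps at offsets -1, 0, 1, 2 from floor(x_original)."""
    def kernel(d):
        d = abs(d)
        if d <= 1:
            return (a + 2) * d**3 - (a + 3) * d**2 + 1
        if d < 2:
            return a * d**3 - 5 * a * d**2 + 8 * a * d - 4 * a
        return 0.0
    return [kernel(t + 1), kernel(t), kernel(t - 1), kernel(t - 2)]


def resize_cubic_1d(row, scale, a=-0.75):
    """Resize one axis with cubic interpolation (half_pixel, edge clamp)."""
    out = []
    for x_resized in range(math.floor(len(row) * scale)):
        x_original = (x_resized + 0.5) / scale - 0.5
        base = math.floor(x_original)
        acc = 0.0
        for tap, weight in zip(range(base - 1, base + 3),
                               cubic_weights(x_original - base, a)):
            idx = min(max(tap, 0), len(row) - 1)  # assumption: clamp to the edge
            acc += weight * row[idx]
        out.append(acc)
    return out


data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
# Apply the 1-D resize along the last axis, then along the first (separable).
rows = [resize_cubic_1d(r, 0.8) for r in data]
cols = [resize_cubic_1d(list(c), 0.8) for c in zip(*rows)]
result = [list(r) for r in zip(*cols)]
print(result)
```

For this input the sketch gives values that agree with the commented output of the example above to within floating-point tolerance.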


<details>
<summary>resize_downsample_scales_cubic_A_n0p5_exclude_outside</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
    cubic_coeff_a=-0.5,
    exclude_outside=1,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.8, 0.8], dtype=np.float32)

# [[[[ 1.36812675  2.6695014   4.0133367 ]
#    [ 6.57362535  7.875       9.2188353 ]
#    [11.94896657 13.25034122 14.59417652]]]]
output = interpolate_nd(
    data,
    lambda x, _: cubic_coeffs(x, A=-0.5),
    scale_factors=scales,
    exclude_outside=True,
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_cubic_A_n0p5_exclude_outside",
)
```

</details>


<details>
<summary>resize_downsample_scales_cubic_align_corners</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
    coordinate_transformation_mode="align_corners",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.8, 0.8], dtype=np.float32)

# [[[[ 1.          2.39519159  3.79038317]
#    [ 6.58076634  7.97595793  9.37114951]
#    [12.16153268 13.55672427 14.95191585]]]]
output = interpolate_nd(
    data,
    lambda x, _: cubic_coeffs(x),
    scale_factors=scales,
    coordinate_transformation_mode="align_corners",
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_cubic_align_corners",
)
```

</details>


<details>
<summary>resize_downsample_scales_cubic_antialias</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
    antialias=1,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)

# [[[[ 2.5180721  4.2858863]
#    [ 9.589329  11.357142 ]]]]
output = interpolate_nd(
    data, cubic_coeffs_antialias, scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_cubic_antialias",
)
```

</details>


<details>
<summary>resize_downsample_scales_linear</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="linear",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)

# [[[[2.6666665 4.3333331]]]]
output = interpolate_nd(
    data, lambda x, _: linear_coeffs(x), scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_linear",
)
```

</details>


<details>
<summary>resize_downsample_scales_linear_align_corners</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="align_corners",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)

# [[[[1.       3.142857]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    scale_factors=scales,
    coordinate_transformation_mode="align_corners",
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_linear_align_corners",
)
```

</details>


<details>
<summary>resize_downsample_scales_linear_antialias</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="linear",
    antialias=1,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)

# [[[[ 2.875  4.5  ]
#    [ 9.375 11.   ]]]]
output = interpolate_nd(
    data, linear_coeffs_antialias, scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_linear_antialias",
)
```

</details>


<details>
<summary>resize_downsample_scales_linear_half_pixel_symmetric</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="half_pixel_symmetric",
)

data = np.array([[[[1, 2, 3, 4]]]], dtype=np.float32)
scales = np.array([1.0, 1.0, 1.0, 0.6], dtype=np.float32)

# [[[[1.6666667, 3.3333333]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    scale_factors=scales,
    coordinate_transformation_mode="half_pixel_symmetric",
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_linear_half_pixel_symmetric",
)
```

</details>


<details>
<summary>resize_downsample_scales_nearest</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="nearest",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 0.6, 0.6], dtype=np.float32)

# [[[[1. 3.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_downsample_scales_nearest",
)
```

</details>


<details>
<summary>resize_downsample_sizes_cubic</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="cubic",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 3, 3], dtype=np.int64)

# [[[[ 1.63078704  3.00462963  4.37847222]
#    [ 7.12615741  8.5         9.87384259]
#    [12.62152778 13.99537037 15.36921296]]]]
output = interpolate_nd(
    data, lambda x, _: cubic_coeffs(x), output_size=sizes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_downsample_sizes_cubic",
)
```

</details>


<details>
<summary>resize_downsample_sizes_cubic_antialias</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="cubic",
    antialias=1,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 3, 3], dtype=np.int64)

# [[[[ 1.7750092  3.1200073  4.4650054]
#    [ 7.1550016  8.5        9.844998 ]
#    [12.534994  13.8799925 15.224991 ]]]]
output = interpolate_nd(data, cubic_coeffs_antialias, output_size=sizes).astype(
    np.float32
)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_downsample_sizes_cubic_antialias",
)
```

</details>


<details>
<summary>resize_downsample_sizes_linear_antialias</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="linear",
    antialias=1,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 3, 3], dtype=np.int64)

# [[[[ 2.3636363  3.590909   4.818182 ]
#    [ 7.2727275  8.5        9.727273 ]
#    [12.181818  13.409091  14.636364 ]]]]
output = interpolate_nd(
    data, linear_coeffs_antialias, output_size=sizes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_downsample_sizes_linear_antialias",
)
```

</details>


<details>
<summary>resize_downsample_sizes_linear_pytorch_half_pixel</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="pytorch_half_pixel",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 3, 1], dtype=np.int64)

# [[[[ 1.6666666]
#    [ 7.       ]
#    [12.333333 ]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    output_size=sizes,
    coordinate_transformation_mode="pytorch_half_pixel",
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_downsample_sizes_linear_pytorch_half_pixel",
)
```

</details>


<details>
<summary>resize_downsample_sizes_nearest</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 1, 3], dtype=np.int64)

# [[[[1. 2. 4.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), output_size=sizes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_downsample_sizes_nearest",
)
```

</details>
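
The example above hits a halfway case, which is where `nearest_mode` matters. The sketch below shows one way to express the four modes for a single axis, using the default `"half_pixel"` transformation. The helper names are hypothetical, and clamping the index to the valid range is an assumption of this sketch.

```python
import math


def nearest_index(x_original, nearest_mode="round_prefer_floor"):
    """Pick the input index for x_original according to nearest_mode."""
    if nearest_mode == "round_prefer_floor":
        return math.ceil(x_original - 0.5)   # round half down
    if nearest_mode == "round_prefer_ceil":
        return math.floor(x_original + 0.5)  # round half up
    if nearest_mode == "floor":
        return math.floor(x_original)
    if nearest_mode == "ceil":
        return math.ceil(x_original)
    raise ValueError(f"unknown nearest_mode: {nearest_mode}")


def resize_nearest_1d(row, length_resized, nearest_mode="round_prefer_floor"):
    """Resize one axis to length_resized with nearest interpolation."""
    scale = length_resized / len(row)
    out = []
    for x_resized in range(length_resized):
        x_original = (x_resized + 0.5) / scale - 0.5  # half_pixel
        idx = nearest_index(x_original, nearest_mode)
        idx = min(max(idx, 0), len(row) - 1)  # assumption: clamp to valid range
        out.append(row[idx])
    return out


print(resize_nearest_1d([1, 2, 3, 4], 3))
print(resize_nearest_1d([1, 2, 3, 4], 3, "round_prefer_ceil"))
```

Here the middle output element maps to `x_original = 1.5`, so `"round_prefer_floor"` selects index 1 while `"round_prefer_ceil"` selects index 2.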


<details>
<summary>resize_downsample_sizes_nearest_not_larger</summary>

```python
keep_aspect_ratio_policy = "not_larger"
axes = [2, 3]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 3], dtype=np.int64)  # Results in 1x2

# [[[[1. 3.]]]]
output = interpolate_nd(
    data,
    lambda x, _: nearest_coeffs(x),
    output_size=sizes,
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_downsample_sizes_nearest_not_larger",
)
```

</details>


<details>
<summary>resize_downsample_sizes_nearest_not_smaller</summary>

```python
keep_aspect_ratio_policy = "not_smaller"
axes = [2, 3]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 3], dtype=np.int64)  # Results in 2x3

# [[[[1. 2. 4.]
#    [5. 6. 8.]]]]
output = interpolate_nd(
    data,
    lambda x, _: nearest_coeffs(x),
    output_size=sizes,
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_downsample_sizes_nearest_not_smaller",
)
```

</details>
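
The two `keep_aspect_ratio_policy` examples above differ only in how `sizes` is turned into an output shape. A minimal sketch of that step follows; `resolve_output_shape` and `round_int` are hypothetical helper names, with `round_int` rounding halfway cases up as described in the attribute text.

```python
import math


def round_int(value):
    """Nearest integer, rounding halfway cases up."""
    return math.floor(value + 0.5)


def resolve_output_shape(in_shape, sizes, axes=None, policy="stretch"):
    """Compute the output shape from sizes under keep_aspect_ratio_policy."""
    rank = len(in_shape)
    axes = list(range(rank)) if axes is None else [a % rank for a in axes]
    out = list(in_shape)  # non-resizable axes keep the input size

    if policy == "stretch":
        for i, d in enumerate(axes):
            out[d] = sizes[i]
        return out

    ratios = [sizes[i] / in_shape[d] for i, d in enumerate(axes)]
    if policy == "not_larger":
        scale = min(ratios)
    elif policy == "not_smaller":
        scale = max(ratios)
    else:
        raise ValueError(f"unknown keep_aspect_ratio_policy: {policy}")

    for d in axes:
        out[d] = round_int(scale * in_shape[d])
    return out


for policy in ("stretch", "not_larger", "not_smaller"):
    print(policy, resolve_output_shape([1, 1, 2, 4], [1, 3], axes=[2, 3], policy=policy))
```

With an input of shape `(1, 1, 2, 4)`, `axes = [2, 3]` and `sizes = [1, 3]`, the resized axes come out as 1x2 for `"not_larger"` and 2x3 for `"not_smaller"`, matching the comments in the examples.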


<details>
<summary>resize_tf_crop_and_resize</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "roi", "", "sizes"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="tf_crop_and_resize",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

# Note: for some rois, the result may differ from that of TF because of floating-point inaccuracy
roi = np.array([0, 0, 0.4, 0.6, 1, 1, 0.6, 0.8], dtype=np.float32)
sizes = np.array([1, 1, 3, 3], dtype=np.int64)

# [[[[ 7.6000004  7.9        8.2      ]
#    [ 8.8        9.1        9.400001 ]
#    [10.        10.3       10.6      ]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    output_size=sizes,
    roi=roi,
    coordinate_transformation_mode="tf_crop_and_resize",
).astype(np.float32)

expect(
    node,
    inputs=[data, roi, sizes],
    outputs=[output],
    name="test_resize_tf_crop_and_resize",
)
```

</details>
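
To make the example above easier to follow, here is a worked sketch of the same crop in plain Python. It applies the `"tf_crop_and_resize"` coordinate formula to the last two axes and samples with ordinary bilinear interpolation. The helpers `tf_crop_coord` and `bilinear` are hypothetical, and the sketch assumes every sampled coordinate lies inside the tensor, so `extrapolation_value` is never needed.

```python
import math


def tf_crop_coord(x_resized, start, end, length_original, length_resized):
    """tf_crop_and_resize coordinate formula for one axis."""
    if length_resized > 1:
        return (start * (length_original - 1)
                + x_resized * (end - start) * (length_original - 1)
                / (length_resized - 1))
    return 0.5 * (start + end) * (length_original - 1)


def bilinear(data, row, col):
    """Bilinear sample of a 2-D list; assumes (row, col) is inside the data."""
    r0, c0 = math.floor(row), math.floor(col)
    r1 = min(r0 + 1, len(data) - 1)
    c1 = min(c0 + 1, len(data[0]) - 1)
    fr, fc = row - r0, col - c0
    top = data[r0][c0] * (1 - fc) + data[r0][c1] * fc
    bottom = data[r1][c0] * (1 - fc) + data[r1][c1] * fc
    return top * (1 - fr) + bottom * fr


data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
# roi = [start1, ..., startN, end1, ..., endN]; only the last two axes are cropped.
row_start, col_start, row_end, col_end = 0.4, 0.6, 0.6, 0.8

output = [
    [
        bilinear(
            data,
            tf_crop_coord(y, row_start, row_end, 4, 3),
            tf_crop_coord(x, col_start, col_end, 4, 3),
        )
        for x in range(3)
    ]
    for y in range(3)
]
print(output)
```

The sampled rows are approximately 1.2, 1.5 and 1.8 and the sampled columns approximately 1.8, 2.1 and 2.4, which reproduces the commented output of the example up to floating-point rounding.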


<details>
<summary>resize_tf_crop_and_resize_axes_2_3</summary>

```python
axes = [2, 3]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "roi", "", "sizes"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="tf_crop_and_resize",
    axes=axes,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

# Note: for some rois, the result may differ from that of TF because of floating-point inaccuracy
roi = np.array([0.4, 0.6, 0.6, 0.8], dtype=np.float32)
sizes = np.array([3, 3], dtype=np.int64)

# [[[[ 7.6000004  7.9        8.2      ]
#    [ 8.8        9.1        9.400001 ]
#    [10.        10.3       10.6      ]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    output_size=sizes,
    roi=roi,
    axes=axes,
    coordinate_transformation_mode="tf_crop_and_resize",
).astype(np.float32)

expect(
    node,
    inputs=[data, roi, sizes],
    outputs=[output],
    name="test_resize_tf_crop_and_resize_axes_2_3",
)
```

</details>


<details>
<summary>resize_tf_crop_and_resize_axes_3_2</summary>

```python
axes = [3, 2]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "roi", "", "sizes"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="tf_crop_and_resize",
    axes=axes,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

# Note: for some rois, the result may differ from that of TF because of floating-point inaccuracy
roi = np.array([0.6, 0.4, 0.8, 0.6], dtype=np.float32)
sizes = np.array([3, 3], dtype=np.int64)

# [[[[ 7.6000004  7.9        8.2      ]
#    [ 8.8        9.1        9.400001 ]
#    [10.        10.3       10.6      ]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    output_size=sizes,
    roi=roi,
    axes=axes,
    coordinate_transformation_mode="tf_crop_and_resize",
).astype(np.float32)

expect(
    node,
    inputs=[data, roi, sizes],
    outputs=[output],
    name="test_resize_tf_crop_and_resize_axes_3_2",
)
```

</details>


<details>
<summary>resize_tf_crop_and_resize_extrapolation_value</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "roi", "", "sizes"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="tf_crop_and_resize",
    extrapolation_value=10.0,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

# Note: for some rois, the result may differ from that of TF because of floating-point inaccuracy
roi = np.array([0, 0, 0.4, 0.6, 1, 1, 1.2, 1.7], dtype=np.float32)
sizes = np.array([1, 1, 3, 3], dtype=np.int64)

# [[[[ 7.6000004 10.        10.       ]
#    [12.400001  10.        10.       ]
#    [10.        10.        10.       ]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    output_size=sizes,
    roi=roi,
    coordinate_transformation_mode="tf_crop_and_resize",
    extrapolation_value=10.0,
).astype(np.float32)

expect(
    node,
    inputs=[data, roi, sizes],
    outputs=[output],
    name="test_resize_tf_crop_and_resize_extrapolation_value",
)
```

</details>


<details>
<summary>resize_upsample_scales_cubic</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)

# [[[[ 0.47265625  0.76953125  1.24609375  1.875       2.28125
#      2.91015625  3.38671875  3.68359375]
#    [ 1.66015625  1.95703125  2.43359375  3.0625      3.46875
#      4.09765625  4.57421875  4.87109375]
#    [ 3.56640625  3.86328125  4.33984375  4.96875     5.375
#      6.00390625  6.48046875  6.77734375]
#    [ 6.08203125  6.37890625  6.85546875  7.484375    7.890625
#      8.51953125  8.99609375  9.29296875]
#    [ 7.70703125  8.00390625  8.48046875  9.109375    9.515625
#     10.14453125 10.62109375 10.91796875]
#    [10.22265625 10.51953125 10.99609375 11.625      12.03125
#     12.66015625 13.13671875 13.43359375]
#    [12.12890625 12.42578125 12.90234375 13.53125    13.9375
#     14.56640625 15.04296875 15.33984375]
#    [13.31640625 13.61328125 14.08984375 14.71875    15.125
#     15.75390625 16.23046875 16.52734375]]]]
output = interpolate_nd(
    data, lambda x, _: cubic_coeffs(x), scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_cubic",
)
```

</details>


<details>
<summary>resize_upsample_scales_cubic_A_n0p5_exclude_outside</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
    cubic_coeff_a=-0.5,
    exclude_outside=True,
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)

# [[[[ 0.55882353  0.81494204  1.35698249  1.89705882  2.39705882
#      2.93713516  3.47917561  3.73529412]
#    [ 1.58329755  1.83941606  2.38145651  2.92153285  3.42153285
#      3.96160918  4.50364964  4.75976814]
#    [ 3.75145936  4.00757787  4.54961832  5.08969466  5.58969466
#      6.12977099  6.67181144  6.92792995]
#    [ 5.91176471  6.16788321  6.70992366  7.25        7.75
#      8.29007634  8.83211679  9.08823529]
#    [ 7.91176471  8.16788321  8.70992366  9.25        9.75
#     10.29007634 10.83211679 11.08823529]
#    [10.07207005 10.32818856 10.87022901 11.41030534 11.91030534
#     12.45038168 12.99242213 13.24854064]
#    [12.24023186 12.49635036 13.03839082 13.57846715 14.07846715
#     14.61854349 15.16058394 15.41670245]
#    [13.26470588 13.52082439 14.06286484 14.60294118 15.10294118
#     15.64301751 16.18505796 16.44117647]]]]
output = interpolate_nd(
    data,
    lambda x, _: cubic_coeffs(x, A=-0.5),
    scale_factors=scales,
    exclude_outside=True,
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_cubic_A_n0p5_exclude_outside",
)
```

</details>


<details>
<summary>resize_upsample_scales_cubic_align_corners</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
    coordinate_transformation_mode="align_corners",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)

# [[[[ 1.          1.34110787  1.80029155  2.32944606  2.67055394
#      3.19970845  3.65889213  4.        ]
#    [ 2.36443149  2.70553936  3.16472303  3.69387755  4.03498542
#      4.56413994  5.02332362  5.36443149]
#    [ 4.20116618  4.54227405  5.00145773  5.53061224  5.87172012
#      6.40087464  6.86005831  7.20116618]
#    [ 6.31778426  6.65889213  7.1180758   7.64723032  7.98833819
#      8.51749271  8.97667638  9.31778426]
#    [ 7.68221574  8.02332362  8.48250729  9.01166181  9.35276968
#      9.8819242  10.34110787 10.68221574]
#    [ 9.79883382 10.13994169 10.59912536 11.12827988 11.46938776
#     11.99854227 12.45772595 12.79883382]
#    [11.63556851 11.97667638 12.43586006 12.96501458 13.30612245
#     13.83527697 14.29446064 14.63556851]
#    [13.         13.34110787 13.80029155 14.32944606 14.67055394
#     15.19970845 15.65889213 16.        ]]]]
output = interpolate_nd(
    data,
    lambda x, _: cubic_coeffs(x),
    scale_factors=scales,
    coordinate_transformation_mode="align_corners",
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_cubic_align_corners",
)
```

</details>


<details>
<summary>resize_upsample_scales_cubic_asymmetric</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="cubic",
    coordinate_transformation_mode="asymmetric",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)

# [[[[ 1.       1.40625  2.       2.5      3.       3.59375  4.
#      4.09375]
#    [ 2.625    3.03125  3.625    4.125    4.625    5.21875  5.625
#      5.71875]
#    [ 5.       5.40625  6.       6.5      7.       7.59375  8.
#      8.09375]
#    [ 7.       7.40625  8.       8.5      9.       9.59375 10.
#     10.09375]
#    [ 9.       9.40625 10.      10.5     11.      11.59375 12.
#     12.09375]
#    [11.375   11.78125 12.375   12.875   13.375   13.96875 14.375
#     14.46875]
#    [13.      13.40625 14.      14.5     15.      15.59375 16.
#     16.09375]
#    [13.375   13.78125 14.375   14.875   15.375   15.96875 16.375
#     16.46875]]]]
output = interpolate_nd(
    data,
    lambda x, _: cubic_coeffs(x, A=-0.75),
    scale_factors=scales,
    coordinate_transformation_mode="asymmetric",
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_cubic_asymmetric",
)
```

</details>


<details>
<summary>resize_upsample_scales_linear</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="linear",
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)

# [[[[1.   1.25 1.75 2.  ]
#    [1.5  1.75 2.25 2.5 ]
#    [2.5  2.75 3.25 3.5 ]
#    [3.   3.25 3.75 4.  ]]]]
output = interpolate_nd(
    data, lambda x, _: linear_coeffs(x), scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_linear",
)
```

</details>


<details>
<summary>resize_upsample_scales_linear_align_corners</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="align_corners",
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 2.0], dtype=np.float32)

# [[[[1.         1.33333333 1.66666667 2.        ]
#    [1.66666667 2.         2.33333333 2.66666667]
#    [2.33333333 2.66666667 3.         3.33333333]
#    [3.         3.33333333 3.66666667 4.        ]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    scale_factors=scales,
    coordinate_transformation_mode="align_corners",
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_linear_align_corners",
)
```

</details>
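
The linear examples above differ only in how an output index is mapped back to an input coordinate. The following is a minimal 1-D sketch of that mapping, assuming the `half_pixel`, `align_corners`, and `asymmetric` formulas from the Resize specification. The helper name `original_coords` is hypothetical, and `np.interp` stands in for the reference `interpolate_nd` helper, so this is an illustration and not the reference implementation.

```python
import numpy as np


def original_coords(out_len, in_len, mode):
    """Map output indices 0..out_len-1 to fractional input coordinates (one axis)."""
    x = np.arange(out_len, dtype=np.float64)
    scale = out_len / in_len
    if mode == "half_pixel":
        return (x + 0.5) / scale - 0.5
    if mode == "align_corners":
        # Assumes out_len > 1; the single-element case is not handled in this sketch.
        return x * (in_len - 1) / (out_len - 1)
    if mode == "asymmetric":
        return x / scale
    raise ValueError(f"unsupported mode: {mode}")


row = np.array([1.0, 2.0])
positions = np.arange(row.size)

# np.interp clamps coordinates outside [0, in_len - 1] to the edge values.
half_pixel = np.interp(original_coords(4, 2, "half_pixel"), positions, row)
align_corners = np.interp(original_coords(4, 2, "align_corners"), positions, row)
```

With the input row `[1, 2]` upsampled to length 4, `half_pixel` gives the first row of the `resize_upsample_scales_linear` output and `align_corners` gives the first row of the `resize_upsample_scales_linear_align_corners` output.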


<details>
<summary>resize_upsample_scales_linear_half_pixel_symmetric</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="linear",
    coordinate_transformation_mode="half_pixel_symmetric",
)

data = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
scales = np.array([1.0, 1.0, 2.3, 2.94], dtype=np.float32)

# [[[[1.        , 1.15986395, 1.5       , 1.84013605, 2.        ],
#    [1.56521738, 1.72508133, 2.06521738, 2.40535343, 2.56521738],
#    [2.43478262, 2.59464657, 2.93478262, 3.27491867, 3.43478262],
#    [3.        , 3.15986395, 3.5       , 3.84013605, 4.        ]]]]
output = interpolate_nd(
    data,
    lambda x, _: linear_coeffs(x),
    scale_factors=scales,
    coordinate_transformation_mode="half_pixel_symmetric",
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_linear_half_pixel_symmetric",
)
```

</details>


<details>
<summary>resize_upsample_scales_nearest</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="nearest",
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 3.0], dtype=np.float32)

# [[[[1. 1. 1. 2. 2. 2.]
#    [1. 1. 1. 2. 2. 2.]
#    [3. 3. 3. 4. 4. 4.]
#    [3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), scale_factors=scales
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_nearest",
)
```

</details>


<details>
<summary>resize_upsample_scales_nearest_axes_2_3</summary>

```python
axes = [2, 3]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([2.0, 3.0], dtype=np.float32)

# [[[[1. 1. 1. 2. 2. 2.]
#    [1. 1. 1. 2. 2. 2.]
#    [3. 3. 3. 4. 4. 4.]
#    [3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), scale_factors=scales, axes=axes
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_nearest_axes_2_3",
)
```

</details>


<details>
<summary>resize_upsample_scales_nearest_axes_3_2</summary>

```python
axes = [3, 2]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "scales"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([3.0, 2.0], dtype=np.float32)

# [[[[1. 1. 1. 2. 2. 2.]
#    [1. 1. 1. 2. 2. 2.]
#    [3. 3. 3. 4. 4. 4.]
#    [3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), scale_factors=scales, axes=axes
).astype(np.float32)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_resize_upsample_scales_nearest_axes_3_2",
)
```

</details>


<details>
<summary>resize_upsample_sizes_cubic</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="cubic",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 9, 10], dtype=np.int64)

# [[[[ 0.45507922  0.64057922  0.97157922  1.42257922  1.90732922
#      2.22332922  2.70807922  3.15907922  3.49007922  3.67557922]
#    [ 1.39437963  1.57987963  1.91087963  2.36187963  2.84662963
#      3.16262963  3.64737963  4.09837963  4.42937963  4.61487963]
#    [ 2.95130693  3.13680693  3.46780693  3.91880693  4.40355693
#      4.71955693  5.20430693  5.65530693  5.98630693  6.17180693]
#    [ 5.20525069  5.39075069  5.72175069  6.17275069  6.65750069
#      6.97350069  7.45825069  7.90925069  8.24025069  8.42575069]
#    [ 6.88975     7.07525     7.40625     7.85725     8.342
#      8.658       9.14275     9.59375     9.92475    10.11025   ]
#    [ 8.57424931  8.75974931  9.09074931  9.54174931 10.02649931
#     10.34249931 10.82724931 11.27824931 11.60924931 11.79474931]
#    [10.82819307 11.01369307 11.34469307 11.79569307 12.28044307
#     12.59644307 13.08119307 13.53219307 13.86319307 14.04869307]
#    [12.38512037 12.57062037 12.90162037 13.35262037 13.83737037
#     14.15337037 14.63812037 15.08912037 15.42012037 15.60562037]
#    [13.32442078 13.50992078 13.84092078 14.29192078 14.77667078
#     15.09267078 15.57742078 16.02842078 16.35942078 16.54492078]]]]
output = interpolate_nd(
    data, lambda x, _: cubic_coeffs(x), output_size=sizes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_cubic",
)
```

</details>


<details>
<summary>resize_upsample_sizes_nearest</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 7, 8], dtype=np.int64)

# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), output_size=sizes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest",
)
```

</details>


<details>
<summary>resize_upsample_sizes_nearest_axes_2_3</summary>

```python
axes = [2, 3]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([7, 8], dtype=np.int64)

# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), output_size=sizes, axes=axes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest_axes_2_3",
)
```

</details>


<details>
<summary>resize_upsample_sizes_nearest_axes_3_2</summary>

```python
axes = [3, 2]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([8, 7], dtype=np.int64)

# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x), output_size=sizes, axes=axes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest_axes_3_2",
)
```

</details>


<details>
<summary>resize_upsample_sizes_nearest_ceil_half_pixel</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    coordinate_transformation_mode="half_pixel",
    nearest_mode="ceil",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 8, 8], dtype=np.int64)

# [[[[ 1.  2.  2.  3.  3.  4.  4.  4.]
#    [ 5.  6.  6.  7.  7.  8.  8.  8.]
#    [ 5.  6.  6.  7.  7.  8.  8.  8.]
#    [ 9. 10. 10. 11. 11. 12. 12. 12.]
#    [ 9. 10. 10. 11. 11. 12. 12. 12.]
#    [13. 14. 14. 15. 15. 16. 16. 16.]
#    [13. 14. 14. 15. 15. 16. 16. 16.]
#    [13. 14. 14. 15. 15. 16. 16. 16.]]]]
output = interpolate_nd(
    data, lambda x, _: nearest_coeffs(x, mode="ceil"), output_size=sizes
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest_ceil_half_pixel",
)
```

</details>


<details>
<summary>resize_upsample_sizes_nearest_floor_align_corners</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    coordinate_transformation_mode="align_corners",
    nearest_mode="floor",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 8, 8], dtype=np.int64)

# [[[[ 1.  1.  1.  2.  2.  3.  3.  4.]
#    [ 1.  1.  1.  2.  2.  3.  3.  4.]
#    [ 1.  1.  1.  2.  2.  3.  3.  4.]
#    [ 5.  5.  5.  6.  6.  7.  7.  8.]
#    [ 5.  5.  5.  6.  6.  7.  7.  8.]
#    [ 9.  9.  9. 10. 10. 11. 11. 12.]
#    [ 9.  9.  9. 10. 10. 11. 11. 12.]
#    [13. 13. 13. 14. 14. 15. 15. 16.]]]]
output = interpolate_nd(
    data,
    lambda x, _: nearest_coeffs(x, mode="floor"),
    output_size=sizes,
    coordinate_transformation_mode="align_corners",
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest_floor_align_corners",
)
```

</details>


<details>
<summary>resize_upsample_sizes_nearest_not_larger</summary>

```python
keep_aspect_ratio_policy = "not_larger"
axes = [2, 3]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([7, 8], dtype=np.int64)  # Results in 7x7

# [[[[1. 1. 1. 1. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2.]
#    [3. 3. 3. 3. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4.]]]]
output = interpolate_nd(
    data,
    lambda x, _: nearest_coeffs(x),
    output_size=sizes,
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest_not_larger",
)
```

</details>


<details>
<summary>resize_upsample_sizes_nearest_not_smaller</summary>

```python
keep_aspect_ratio_policy = "not_smaller"
axes = [2, 3]
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([7, 8], dtype=np.int64)  # Results in 8x8

# [[[[1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [1. 1. 1. 1. 2. 2. 2. 2.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]
#    [3. 3. 3. 3. 4. 4. 4. 4.]]]]
output = interpolate_nd(
    data,
    lambda x, _: nearest_coeffs(x),
    output_size=sizes,
    axes=axes,
    keep_aspect_ratio_policy=keep_aspect_ratio_policy,
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest_not_smaller",
)
```

</details>
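
The two `keep_aspect_ratio_policy` examples above request the same `sizes` but produce different output shapes. The sketch below shows how the effective output size could be derived. It assumes a single scale chosen as the minimum (`not_larger`) or maximum (`not_smaller`) of the per-axis ratios, and rounding of halfway cases upward. The function name `resolve_output_size` is hypothetical.

```python
import math


def resolve_output_size(in_sizes, requested_sizes, policy):
    """Return the output extent per resized axis for a given aspect-ratio policy."""
    if policy == "stretch":
        # The requested sizes are used as-is; the aspect ratio may change.
        return list(requested_sizes)
    ratios = [req / cur for req, cur in zip(requested_sizes, in_sizes)]
    if policy == "not_larger":
        scale = min(ratios)
    elif policy == "not_smaller":
        scale = max(ratios)
    else:
        raise ValueError(f"unsupported policy: {policy}")
    # Round to the nearest integer, with halfway cases rounded up.
    return [int(math.floor(scale * cur + 0.5)) for cur in in_sizes]


not_larger = resolve_output_size([2, 2], [7, 8], "not_larger")
not_smaller = resolve_output_size([2, 2], [7, 8], "not_smaller")
```

For the 2x2 input and `sizes = [7, 8]` used above, this yields 7x7 for `not_larger` and 8x8 for `not_smaller`, matching the comments in the examples.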


<details>
<summary>resize_upsample_sizes_nearest_round_prefer_ceil_asymmetric</summary>

```python
node = onnx.helper.make_node(
    "Resize",
    inputs=["X", "", "", "sizes"],
    outputs=["Y"],
    mode="nearest",
    coordinate_transformation_mode="asymmetric",
    nearest_mode="round_prefer_ceil",
)

data = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ],
    dtype=np.float32,
)

sizes = np.array([1, 1, 8, 8], dtype=np.int64)

# [[[[ 1.  2.  2.  3.  3.  4.  4.  4.]
#    [ 5.  6.  6.  7.  7.  8.  8.  8.]
#    [ 5.  6.  6.  7.  7.  8.  8.  8.]
#    [ 9. 10. 10. 11. 11. 12. 12. 12.]
#    [ 9. 10. 10. 11. 11. 12. 12. 12.]
#    [13. 14. 14. 15. 15. 16. 16. 16.]
#    [13. 14. 14. 15. 15. 16. 16. 16.]
#    [13. 14. 14. 15. 15. 16. 16. 16.]]]]
output = interpolate_nd(
    data,
    lambda x, _: nearest_coeffs(x, mode="round_prefer_ceil"),
    output_size=sizes,
    coordinate_transformation_mode="asymmetric",
).astype(np.float32)

expect(
    node,
    inputs=[data, sizes],
    outputs=[output],
    name="test_resize_upsample_sizes_nearest_round_prefer_ceil_asymmetric",
)
```

</details>
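
The `nearest_mode` attribute decides which input index a fractional coordinate snaps to. The following 1-D sketch covers the four rounding modes used in the examples above. The helper name `nearest_index` is hypothetical, and clamping to the valid index range is an assumption of this sketch.

```python
import math


def nearest_index(x, mode, in_len):
    """Pick the input index for fractional coordinate x under a nearest_mode."""
    if mode == "round_prefer_floor":
        idx = math.ceil(x - 0.5)   # ties such as 0.5 go down
    elif mode == "round_prefer_ceil":
        idx = math.floor(x + 0.5)  # ties such as 0.5 go up
    elif mode == "floor":
        idx = math.floor(x)
    elif mode == "ceil":
        idx = math.ceil(x)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    # Clamp so that coordinates past the edge reuse the border element.
    return min(max(idx, 0), in_len - 1)


row = [1.0, 2.0, 3.0, 4.0]
# "asymmetric" coordinates for upsampling a length-4 axis to length 8.
coords = [i / 2.0 for i in range(8)]
picked = [row[nearest_index(c, "round_prefer_ceil", len(row))] for c in coords]
```

Here `picked` reproduces the first row of the `resize_upsample_sizes_nearest_round_prefer_ceil_asymmetric` output.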


### <a name="ReverseSequence"></a><a name="reversesequence">**ReverseSequence**</a>

  Reverses a batch of sequences whose individual lengths are specified by `sequence_lens`.

  For each slice i along the batch axis, the operator reverses the first sequence_lens[i] elements along the time axis
  and copies the elements at index sequence_lens[i] and beyond to the output unchanged. The output slice i therefore
  contains the first sequence_lens[i] elements in reversed order, followed by the remaining elements with their original values.

  Example 1:
    input = [[0.0, 4.0, 8.0,  12.0],
             [1.0, 5.0, 9.0,  13.0],
             [2.0, 6.0, 10.0, 14.0],
             [3.0, 7.0, 11.0, 15.0]]
    sequence_lens = [4, 3, 2, 1]
    time_axis = 0
    batch_axis = 1

    output = [[3.0, 6.0, 9.0,  12.0],
              [2.0, 5.0, 8.0,  13.0],
              [1.0, 4.0, 10.0, 14.0],
              [0.0, 7.0, 11.0, 15.0]]

  Example 2:
    input = [[0.0,  1.0,  2.0,  3.0 ],
             [4.0,  5.0,  6.0,  7.0 ],
             [8.0,  9.0,  10.0, 11.0],
             [12.0, 13.0, 14.0, 15.0]]
    sequence_lens = [1, 2, 3, 4]
    time_axis = 1
    batch_axis = 0

    output = [[0.0,  1.0,  2.0,  3.0 ],
              [5.0,  4.0,  6.0,  7.0 ],
              [10.0, 9.0,  8.0,  11.0],
              [15.0, 14.0, 13.0, 12.0]]
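
The behavior described above can be sketched in NumPy as follows. This is an illustrative sketch, not the reference implementation. The function name `reverse_sequence` is hypothetical, and it relies on `np.moveaxis` returning a view so that writes land in the output array.

```python
import numpy as np


def reverse_sequence(x, sequence_lens, time_axis=0, batch_axis=1):
    """Reverse the first sequence_lens[i] time steps of each batch slice i."""
    y = x.copy()
    # Views with the batch axis first and the time axis second.
    x_view = np.moveaxis(x, (batch_axis, time_axis), (0, 1))
    y_view = np.moveaxis(y, (batch_axis, time_axis), (0, 1))
    for i, length in enumerate(sequence_lens):
        # Elements at index `length` and beyond keep their copied values.
        y_view[i, :length] = x_view[i, :length][::-1]
    return y


x = np.array(
    [
        [0.0, 4.0, 8.0, 12.0],
        [1.0, 5.0, 9.0, 13.0],
        [2.0, 6.0, 10.0, 14.0],
        [3.0, 7.0, 11.0, 15.0],
    ],
    dtype=np.float32,
)
y = reverse_sequence(x, np.array([4, 3, 2, 1], dtype=np.int64))
```

With the input of Example 1, `y` equals the output listed there.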

#### Version

This version of the operator has been available since version 10 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>batch_axis</tt> : int (default is 1)</dt>
<dd>(Optional) Specify which axis is the batch axis. Must be either 1 (default) or 0.</dd>
<dt><tt>time_axis</tt> : int (default is 0)</dt>
<dd>(Optional) Specify which axis is the time axis. Must be either 0 (default) or 1.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> : T</dt>
<dd>Tensor of rank r >= 2.</dd>
<dt><tt>sequence_lens</tt> : tensor(int64)</dt>
<dd>Tensor specifying lengths of the sequences in a batch. It has shape `[batch_size]`.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T</dt>
<dd>Tensor with the same shape as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Input and output types can be of any tensor type.</dd>
</dl>


#### Examples

<details>
<summary>reversesequence_batch</summary>

```python
node = onnx.helper.make_node(
    "ReverseSequence",
    inputs=["x", "sequence_lens"],
    outputs=["y"],
    time_axis=1,
    batch_axis=0,
)
x = np.array(
    [
        [0.0, 1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0, 7.0],
        [8.0, 9.0, 10.0, 11.0],
        [12.0, 13.0, 14.0, 15.0],
    ],
    dtype=np.float32,
)
sequence_lens = np.array([1, 2, 3, 4], dtype=np.int64)

y = np.array(
    [
        [0.0, 1.0, 2.0, 3.0],
        [5.0, 4.0, 6.0, 7.0],
        [10.0, 9.0, 8.0, 11.0],
        [15.0, 14.0, 13.0, 12.0],
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[x, sequence_lens],
    outputs=[y],
    name="test_reversesequence_batch",
)
```

</details>


<details>
<summary>reversesequence_time</summary>

```python
node = onnx.helper.make_node(
    "ReverseSequence",
    inputs=["x", "sequence_lens"],
    outputs=["y"],
    time_axis=0,
    batch_axis=1,
)
x = np.array(
    [
        [0.0, 4.0, 8.0, 12.0],
        [1.0, 5.0, 9.0, 13.0],
        [2.0, 6.0, 10.0, 14.0],
        [3.0, 7.0, 11.0, 15.0],
    ],
    dtype=np.float32,
)
sequence_lens = np.array([4, 3, 2, 1], dtype=np.int64)

y = np.array(
    [
        [3.0, 6.0, 9.0, 12.0],
        [2.0, 5.0, 8.0, 13.0],
        [1.0, 4.0, 10.0, 14.0],
        [0.0, 7.0, 11.0, 15.0],
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[x, sequence_lens],
    outputs=[y],
    name="test_reversesequence_time",
)
```

</details>


### <a name="RoiAlign"></a><a name="roialign">**RoiAlign**</a>

  Region of Interest (RoI) align operation described in the
  [Mask R-CNN paper](https://arxiv.org/abs/1703.06870).
  RoiAlign consumes an input tensor X and regions of interest (rois)
  and applies pooling across each RoI; it produces a 4-D tensor of shape
  (num_rois, C, output_height, output_width).

  RoiAlign was proposed to avoid misalignment by removing the
  quantization steps applied when converting from the original image to the
  feature map and from the feature map to the RoI feature; in each RoI bin,
  the values at the sampled locations are computed directly
  through bilinear interpolation.
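
As an illustration of the bilinear interpolation step mentioned above, the following sketch samples a single-channel 2-D feature map at a continuous location. The function name `bilinear_sample` is hypothetical, and clamping the location to the valid index range is an assumption of this sketch rather than a statement about the operator's border handling.

```python
import numpy as np


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map at the continuous point (y, x)."""
    height, width = feature.shape
    # Clamp the sampling point into the valid range (assumption of this sketch).
    y = min(max(y, 0.0), height - 1)
    x = min(max(x, 0.0), width - 1)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    ly, lx = y - y0, x - x0
    # Interpolate along x on the two neighboring rows, then along y.
    top = (1.0 - lx) * feature[y0, x0] + lx * feature[y0, x1]
    bottom = (1.0 - lx) * feature[y1, x0] + lx * feature[y1, x1]
    return float((1.0 - ly) * top + ly * bottom)


feature = np.array([[0.0, 1.0], [2.0, 3.0]], dtype=np.float32)
center = bilinear_sample(feature, 0.5, 0.5)
```

At the center of the 2x2 map every neighbor has weight 0.25, so `center` is the mean of the four values.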

#### Version

This version of the operator has been available since version 16 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#RoiAlign-10">10</a>

#### Attributes

<dl>
<dt><tt>coordinate_transformation_mode</tt> : string (default is half_pixel)</dt>
<dd>Allowed values are 'half_pixel' and 'output_half_pixel'. Use the value 'half_pixel' to pixel shift the input coordinates by -0.5 (the recommended behavior). Use the value 'output_half_pixel' to omit the pixel shift for the input (use this for a backward-compatible behavior).</dd>
<dt><tt>mode</tt> : string (default is avg)</dt>
<dd>The pooling method. Two modes are supported: 'avg' and 'max'. Default is 'avg'.</dd>
<dt><tt>output_height</tt> : int (default is 1)</dt>
<dd>default 1; Pooled output Y's height.</dd>
<dt><tt>output_width</tt> : int (default is 1)</dt>
<dd>default 1; Pooled output Y's width.</dd>
<dt><tt>sampling_ratio</tt> : int (default is 0)</dt>
<dd>Number of sampling points in the interpolation grid used to compute the output value of each pooled output bin. If > 0, then exactly sampling_ratio x sampling_ratio grid points are used. If == 0, then an adaptive number of grid points is used (computed as ceil(roi_width / output_width), and likewise for height). Default is 0.</dd>
<dt><tt>spatial_scale</tt> : float (default is 1.0)</dt>
<dd>Multiplicative spatial scale factor to translate RoI coordinates from their input spatial scale to the scale used when pooling, i.e., the spatial scale of the input feature map X relative to the input image. Default is 1.0.</dd>
</dl>
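
The `sampling_ratio` rule above can be sketched for one axis as follows. The grid-size rule follows the attribute description. The even spacing of sample points inside a bin is an assumption borrowed from common RoIAlign implementations and is not stated in this document. Both function names are hypothetical, and a positive RoI extent is assumed.

```python
import math


def grid_points_per_bin(roi_extent, output_extent, sampling_ratio):
    """Number of sampling points along one axis of each output bin."""
    if sampling_ratio > 0:
        return sampling_ratio
    # Adaptive case: ceil(roi_width / output_width), likewise for height.
    return math.ceil(roi_extent / output_extent)


def sample_positions(roi_start, roi_extent, output_extent, bin_index, sampling_ratio):
    """Sample coordinates along one axis for one bin (even spacing is assumed)."""
    bin_size = roi_extent / output_extent
    count = grid_points_per_bin(roi_extent, output_extent, sampling_ratio)
    return [roi_start + (bin_index + (i + 0.5) / count) * bin_size for i in range(count)]


adaptive = grid_points_per_bin(12.0, 5, 0)
first_bin = sample_positions(0.0, 10.0, 5, 0, 2)
```

For an RoI of width 12 pooled to 5 bins, the adaptive rule gives 3 points per bin along that axis; with `sampling_ratio=2` and an RoI of width 10, the first bin is sampled at 0.5 and 1.5 under the even-spacing assumption.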

#### Inputs

<dl>
<dt><tt>X</tt> : T1</dt>
<dd>Input data tensor from the previous operator; 4-D feature map of shape (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data.</dd>
<dt><tt>rois</tt> : T1</dt>
<dd>RoIs (Regions of Interest) to pool over; rois is 2-D input of shape (num_rois, 4) given as [[x1, y1, x2, y2], ...]. The RoIs' coordinates are in the coordinate system of the input image. Each coordinate set has a 1:1 correspondence with the 'batch_indices' input.</dd>
<dt><tt>batch_indices</tt> : T2</dt>
<dd>1-D tensor of shape (num_rois,) with each element denoting the index of the corresponding image in the batch.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : T1</dt>
<dd>RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element Y[r-1] is a pooled feature map corresponding to the r-th RoI rois[r-1].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain types to float tensors.</dd>
<dt><tt>T2</tt> : tensor(int64)</dt>
<dd>Constrain types to int tensors.</dd>
</dl>


#### Examples

<details>
<summary>roialign_aligned_false</summary>

```python
node = onnx.helper.make_node(
    "RoiAlign",
    inputs=["X", "rois", "batch_indices"],
    outputs=["Y"],
    spatial_scale=1.0,
    output_height=5,
    output_width=5,
    sampling_ratio=2,
    coordinate_transformation_mode="output_half_pixel",
)

X, batch_indices, rois = get_roi_align_input_values()
# (num_rois, C, output_height, output_width)
Y = np.array(
    [
        [
            [
                [0.4664, 0.4466, 0.3405, 0.5688, 0.6068],
                [0.3714, 0.4296, 0.3835, 0.5562, 0.3510],
                [0.2768, 0.4883, 0.5222, 0.5528, 0.4171],
                [0.4713, 0.4844, 0.6904, 0.4920, 0.8774],
                [0.6239, 0.7125, 0.6289, 0.3355, 0.3495],
            ]
        ],
        [
            [
                [0.3022, 0.4305, 0.4696, 0.3978, 0.5423],
                [0.3656, 0.7050, 0.5165, 0.3172, 0.7015],
                [0.2912, 0.5059, 0.6476, 0.6235, 0.8299],
                [0.5916, 0.7389, 0.7048, 0.8372, 0.8893],
                [0.6227, 0.6153, 0.7097, 0.6154, 0.4585],
            ]
        ],
        [
            [
                [0.2384, 0.3379, 0.3717, 0.6100, 0.7601],
                [0.3767, 0.3785, 0.7147, 0.9243, 0.9727],
                [0.5749, 0.5826, 0.5709, 0.7619, 0.8770],
                [0.5355, 0.2566, 0.2141, 0.2796, 0.3600],
                [0.4365, 0.3504, 0.2887, 0.3661, 0.2349],
            ]
        ],
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, rois, batch_indices],
    outputs=[Y],
    name="test_roialign_aligned_false",
)
```

</details>


<details>
<summary>roialign_aligned_true</summary>

```python
node = onnx.helper.make_node(
    "RoiAlign",
    inputs=["X", "rois", "batch_indices"],
    outputs=["Y"],
    spatial_scale=1.0,
    output_height=5,
    output_width=5,
    sampling_ratio=2,
    coordinate_transformation_mode="half_pixel",
)

X, batch_indices, rois = get_roi_align_input_values()
# (num_rois, C, output_height, output_width)
Y = np.array(
    [
        [
            [
                [0.5178, 0.3434, 0.3229, 0.4474, 0.6344],
                [0.4031, 0.5366, 0.4428, 0.4861, 0.4023],
                [0.2512, 0.4002, 0.5155, 0.6954, 0.3465],
                [0.3350, 0.4601, 0.5881, 0.3439, 0.6849],
                [0.4932, 0.7141, 0.8217, 0.4719, 0.4039],
            ]
        ],
        [
            [
                [0.3070, 0.2187, 0.3337, 0.4880, 0.4870],
                [0.1871, 0.4914, 0.5561, 0.4192, 0.3686],
                [0.1433, 0.4608, 0.5971, 0.5310, 0.4982],
                [0.2788, 0.4386, 0.6022, 0.7000, 0.7524],
                [0.5774, 0.7024, 0.7251, 0.7338, 0.8163],
            ]
        ],
        [
            [
                [0.2393, 0.4075, 0.3379, 0.2525, 0.4743],
                [0.3671, 0.2702, 0.4105, 0.6419, 0.8308],
                [0.5556, 0.4543, 0.5564, 0.7502, 0.9300],
                [0.6626, 0.5617, 0.4813, 0.4954, 0.6663],
                [0.6636, 0.3721, 0.2056, 0.1928, 0.2478],
            ]
        ],
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[X, rois, batch_indices],
    outputs=[Y],
    name="test_roialign_aligned_true",
)
```

</details>


<details>
<summary>roialign_mode_max</summary>

```python
X = np.array(
    [
        [
            [
                [
                    0.2764,
                    0.715,
                    0.1958,
                    0.3416,
                    0.4638,
                    0.0259,
                    0.2963,
                    0.6518,
                    0.4856,
                    0.725,
                ],
                [
                    0.9637,
                    0.0895,
                    0.2919,
                    0.6753,
                    0.0234,
                    0.6132,
                    0.8085,
                    0.5324,
                    0.8992,
                    0.4467,
                ],
                [
                    0.3265,
                    0.8479,
                    0.9698,
                    0.2471,
                    0.9336,
                    0.1878,
                    0.4766,
                    0.4308,
                    0.34,
                    0.2162,
                ],
                [
                    0.0206,
                    0.172,
                    0.2155,
                    0.4394,
                    0.0653,
                    0.3406,
                    0.7724,
                    0.3921,
                    0.2541,
                    0.5799,
                ],
                [
                    0.4062,
                    0.2194,
                    0.4473,
                    0.4687,
                    0.7109,
                    0.9327,
                    0.9815,
                    0.632,
                    0.1728,
                    0.6119,
                ],
                [
                    0.3097,
                    0.1283,
                    0.4984,
                    0.5068,
                    0.4279,
                    0.0173,
                    0.4388,
                    0.043,
                    0.4671,
                    0.7119,
                ],
                [
                    0.1011,
                    0.8477,
                    0.4726,
                    0.1777,
                    0.9923,
                    0.4042,
                    0.1869,
                    0.7795,
                    0.9946,
                    0.9689,
                ],
                [
                    0.1366,
                    0.3671,
                    0.7011,
                    0.6234,
                    0.9867,
                    0.5585,
                    0.6985,
                    0.5609,
                    0.8788,
                    0.9928,
                ],
                [
                    0.5697,
                    0.8511,
                    0.6711,
                    0.9406,
                    0.8751,
                    0.7496,
                    0.165,
                    0.1049,
                    0.1559,
                    0.2514,
                ],
                [
                    0.7012,
                    0.4056,
                    0.7879,
                    0.3461,
                    0.0415,
                    0.2998,
                    0.5094,
                    0.3727,
                    0.5482,
                    0.0502,
                ],
            ]
        ]
    ],
    dtype=np.float32,
)
rois = np.array(
    [[0.0, 0.0, 9.0, 9.0], [0.0, 5.0, 4.0, 9.0], [5.0, 5.0, 9.0, 9.0]],
    dtype=np.float32,
)
batch_indices = np.array([0, 0, 0], dtype=np.int64)

Y = np.array(
    [
        [
            [
                [0.3445228, 0.37310338, 0.37865096, 0.446696, 0.37991184],
                [0.4133513, 0.5455125, 0.6651902, 0.55805874, 0.27110294],
                [0.21223956, 0.40924096, 0.8417618, 0.792561, 0.37196714],
                [0.46835402, 0.39741728, 0.8012819, 0.4969306, 0.5495158],
                [0.3595896, 0.5196813, 0.5403741, 0.23814403, 0.19992709],
            ]
        ],
        [
            [
                [0.30517197, 0.5086199, 0.3189761, 0.4054401, 0.47630402],
                [0.50862, 0.8477, 0.37808004, 0.24936005, 0.79384017],
                [0.17620805, 0.29368007, 0.44870415, 0.4987201, 0.63148826],
                [0.51066005, 0.8511, 0.5368801, 0.9406, 0.70008016],
                [0.4487681, 0.51066035, 0.5042561, 0.5643603, 0.42004836],
            ]
        ],
        [
            [
                [0.21062402, 0.3510401, 0.37416005, 0.5967599, 0.46507207],
                [0.32336006, 0.31180006, 0.6236001, 0.9946, 0.7751202],
                [0.35744014, 0.5588001, 0.35897616, 0.7030401, 0.6353923],
                [0.5996801, 0.27940005, 0.17948808, 0.35152006, 0.31769615],
                [0.3598083, 0.40752012, 0.2385281, 0.43856013, 0.26313624],
            ]
        ],
    ],
    dtype=np.float32,
)

node = onnx.helper.make_node(
    "RoiAlign",
    inputs=["X", "rois", "batch_indices"],
    mode="max",
    outputs=["Y"],
    spatial_scale=1.0,
    output_height=5,
    output_width=5,
    sampling_ratio=2,
    coordinate_transformation_mode="output_half_pixel",
)

expect(
    node,
    inputs=[X, rois, batch_indices],
    outputs=[Y],
    name="test_roialign_mode_max",
)
```

</details>


### <a name="Round"></a><a name="round">**Round**</a>

  Round takes one input Tensor and rounds the values, element-wise, meaning
  it finds the nearest integer for each value.
  In case of halves, the rule is to round them to the nearest even integer.
  If input x is integral, +0, -0, NaN, or infinite, x itself is returned.
  The output tensor has the same shape and type as the input.

  Examples:
  ```
  round([0.9]) = [1.0]
  round([2.5]) = [2.0]
  round([2.3]) = [2.0]
  round([1.5]) = [2.0]
  round([-4.5]) = [-4.0]
  ```
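
The round-half-to-even rule above is also what NumPy's `np.round` does by default, so the listed examples can be checked with a short sketch. NumPy is used here only as an illustration; it is not part of the operator definition.

```python
import numpy as np

# The five inputs from the examples above.
x = np.array([0.9, 2.5, 2.3, 1.5, -4.5], dtype=np.float32)
y = np.round(x)  # halves (2.5, 1.5, -4.5) go to the nearest even integer

# Zero, NaN and infinite inputs are passed through.
special = np.array([0.0, np.nan, np.inf, -np.inf], dtype=np.float32)
special_rounded = np.round(special)
```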

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>round</summary>

```python
node = onnx.helper.make_node(
    "Round",
    inputs=["x"],
    outputs=["y"],
)

x = np.array(
    [
        0.1,
        0.5,
        0.9,
        1.2,
        1.5,
        1.8,
        2.3,
        2.5,
        2.7,
        -1.1,
        -1.5,
        -1.9,
        -2.2,
        -2.5,
        -2.8,
    ]
).astype(np.float32)
y = np.array(
    [
        0.0,
        0.0,
        1.0,
        1.0,
        2.0,
        2.0,
        2.0,
        2.0,
        3.0,
        -1.0,
        -2.0,
        -2.0,
        -2.0,
        -2.0,
        -3.0,
    ]
).astype(
    np.float32
)  # expected output
expect(node, inputs=[x], outputs=[y], name="test_round")
```

</details>


### <a name="STFT"></a><a name="stft">**STFT**</a>

  Computes the Short-time Fourier Transform of the signal.

#### Version

This version of the operator has been available since version 17 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>onesided</tt> : int (default is 1)</dt>
<dd>If onesided is 1, only values for w in [0, 1, 2, ..., floor(n_fft/2)] (that is, floor(n_fft/2) + 1 values) are returned because the real-to-complex Fourier transform satisfies the conjugate symmetry, i.e., X[m, w] = X[m, n_fft-w]*. Note that if the input or window tensors are complex, then onesided output is not possible. Enabling onesided with real inputs performs a real-valued fast Fourier transform (RFFT). When invoked with real or complex valued input, the default value is 1. Values can be 0 or 1.</dd>
</dl>
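
The conjugate symmetry that makes the onesided output sufficient for real input can be checked on a single frame with NumPy. This is a minimal sketch; `n_fft` and `frame` are illustrative names, and `np.fft` stands in for whatever DFT a backend actually uses.

```python
import numpy as np

n_fft = 16
frame = np.arange(n_fft, dtype=np.float64)  # an arbitrary real-valued frame

full = np.fft.fft(frame)   # all n_fft complex bins
half = np.fft.rfft(frame)  # only the n_fft // 2 + 1 unique bins

# Conjugate symmetry for real input: X[w] == conj(X[n_fft - w]), w = 1 .. n_fft - 1
w = np.arange(1, n_fft)
symmetric = np.allclose(full[w], np.conj(full[n_fft - w]))
```

The bins above `n_fft // 2` therefore carry no extra information for a real signal, which is why they can be dropped.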

#### Inputs (2 - 4)

<dl>
<dt><tt>signal</tt> (non-differentiable) : T1</dt>
<dd>Input tensor representing a real or complex valued signal. For real input, the following shape is expected: [batch_size][signal_length][1]. For complex input, the following shape is expected: [batch_size][signal_length][2], where [batch_size][signal_length][0] represents the real component and [batch_size][signal_length][1] represents the imaginary component of the signal.</dd>
<dt><tt>frame_step</tt> (non-differentiable) : T2</dt>
<dd>The number of samples to step between successive DFTs.</dd>
<dt><tt>window</tt> (optional, non-differentiable) : T1</dt>
<dd>A tensor representing the window that will be slid over the signal. The window must have rank 1 with shape: [window_shape]. It's an optional value.</dd>
<dt><tt>frame_length</tt> (optional, non-differentiable) : T2</dt>
<dd>A scalar representing the size of the DFT. It's an optional value.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (non-differentiable) : T1</dt>
<dd>The Short-time Fourier Transform of the signals. If onesided is 1, the output has the shape: [batch_size][frames][dft_unique_bins][2], where dft_unique_bins is frame_length // 2 + 1 (the unique components of the DFT). If onesided is 0, the output has the shape: [batch_size][frames][frame_length][2], where frame_length is the length of the DFT.</dd>
</dl>
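
The output shape can be worked out ahead of time. The sketch below uses a hypothetical helper, `stft_output_shape`, and assumes the frame count used by the example in this section (no padding, so `frames = (signal_length - frame_length) // frame_step + 1`).

```python
def stft_output_shape(batch_size, signal_length, frame_length, frame_step, onesided=1):
    """Return the expected STFT output shape (hypothetical helper, no padding assumed)."""
    frames = (signal_length - frame_length) // frame_step + 1
    # onesided keeps only the unique bins of a real-input DFT
    bins = frame_length // 2 + 1 if onesided else frame_length
    return (batch_size, frames, bins, 2)  # last axis holds (real, imaginary)


onesided_shape = stft_output_shape(1, 128, 16, 8)
twosided_shape = stft_output_shape(1, 128, 16, 8, onesided=0)
```

For a 128-sample signal with `frame_length = 16` and `frame_step = 8`, this gives 15 frames with 9 bins each in the onesided case.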

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float), tensor(float16), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain signal and output to float tensors.</dd>
<dt><tt>T2</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain scalar length types to integer tensors (int32 or int64).</dd>
</dl>


#### Examples

<details>
<summary>stft</summary>

```python
signal = np.arange(0, 128, dtype=np.float32).reshape(1, 128, 1)
length = np.array(16).astype(np.int64)
onesided_length = (length >> 1) + 1
step = np.array(8).astype(np.int64)

no_window = ""  # optional input, not supplied
node = onnx.helper.make_node(
    "STFT",
    inputs=["signal", "frame_step", no_window, "frame_length"],
    outputs=["output"],
)

nstfts = ((signal.shape[1] - length) // step) + 1
# [batch_size][frames][dft_unique_bins][2] (onesided defaults to 1)
output = np.empty([1, nstfts, onesided_length, 2], dtype=np.float32)
for i in range(nstfts):
    start = i * step
    stop = i * step + length
    complex_out = np.fft.fft(signal[0, start:stop, 0])[0:onesided_length]
    output[0, i] = np.stack((complex_out.real, complex_out.imag), axis=1)

expect(node, inputs=[signal, step, length], outputs=[output], name="test_stft")

node = onnx.helper.make_node(
    "STFT",
    inputs=["signal", "frame_step", "window"],
    outputs=["output"],
)

# Test with window
a0 = 0.5
a1 = 0.5
window = a0 + a1 * np.cos(
    2 * np.pi * np.arange(0, length, 1, dtype=np.float32) / length
)
nstfts = 1 + (signal.shape[1] - window.shape[0]) // step

# [batch_size][frames][dft_unique_bins][2] (onesided defaults to 1)
output = np.empty([1, nstfts, onesided_length, 2], dtype=np.float32)
for i in range(nstfts):
    start = i * step
    stop = i * step + length
    complex_out = np.fft.fft(signal[0, start:stop, 0] * window)[
        0:onesided_length
    ]
    output[0, i] = np.stack((complex_out.real, complex_out.imag), axis=1)
expect(
    node,
    inputs=[signal, step, window],
    outputs=[output],
    name="test_stft_with_window",
)
```

</details>


### <a name="Scan"></a><a name="scan">**Scan**</a>

  Scan can be used to iterate over one or more scan_input tensors,
  constructing zero or more scan_output tensors. It combines ideas from general recurrences,
  functional programming constructs such as scan, fold, map, and zip, and is intended to enable
  generalizations of RNN-like constructs for sequence-to-sequence processing.
  Other tensors (referred to as state_variables here) can be used to carry a state
  when iterating from one element to another (similar to hidden-state in RNNs, also referred
  to as loop-carried dependences in the context of loops).
  Many common usages involve a single scan_input tensor (where functionality
  similar to scan, fold and map can be obtained). When more than one scan_input is used,
  a behavior similar to zip is obtained.

  The attribute body must be a graph, specifying the computation to be performed in
  every iteration. It takes as input the current values of the state_variables and
  the current iterated element of the scan_inputs. It must return the (updated) values
  of the state_variables and zero or more scan_output_element tensors. The values of the
  scan_output_element tensors are concatenated over all the iterations to produce the
  scan_output values of the scan construct (similar to the concatenated intermediate
  hidden-state values of RNN-like constructs). All the output tensors (state_variables as
  well as scan_output_element tensors) are required to have the same shape in each iteration
  of the loop (a restriction imposed to enable efficient memory allocation).

  Note that the iterated element passed to the body subgraph does not have a sequence
  axis. It will have a rank one less than the rank of the corresponding scan_input.

  The scan operation returns the final values of the state_variables as well as the
  scan_outputs.

  The optional attribute scan_input_directions specifies the direction (forward or backward)
  for each scan input. If this attribute is omitted, all sequences are scanned in the forward
  direction. A bidirectional scan may be performed by specifying the same tensor input twice
  in the scan_inputs, once with a forward direction, and once with a backward direction.

  The scan_output of the operation is produced by concatenating the scan_output_element
  values produced by the body in each iteration.  The optional attribute scan_output_directions
  specifies the direction in which scan_output is constructed (by appending or prepending the
  scan_output_element to scan_output in each iteration) for each scan_output. If this attribute
  is omitted, the scan_output_element is appended to the scan_output in each iteration.
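
The effect of the two direction attributes can be illustrated with a running sum. This is a sketch in plain NumPy, not the operator itself; `running_sum`, `input_reverse` and `output_prepend` are hypothetical names standing in for a body graph and for the flags `scan_input_directions` and `scan_output_directions`.

```python
import numpy as np


def running_sum(x, input_reverse=False, output_prepend=False):
    """Running sum over axis 0 of x, with Scan-like direction flags (illustrative)."""
    state = np.zeros_like(x[0])
    outputs = []
    steps = range(len(x) - 1, -1, -1) if input_reverse else range(len(x))
    for t in steps:
        state = state + x[t]
        if output_prepend:
            outputs.insert(0, state)  # direction 1: prepend the new element
        else:
            outputs.append(state)     # direction 0: append the new element
    return state, np.stack(outputs)


x = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32)
_, forward = running_sum(x)
_, backward = running_sum(x, input_reverse=True)
_, aligned = running_sum(x, input_reverse=True, output_prepend=True)
```

Scanning backward and prepending keeps each output row aligned with the input row it was produced from, which is the combination typically wanted for the backward half of a bidirectional scan.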

  The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.
  If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the
  batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.
  Note that scanning a non-zero axis may be less efficient than scanning axis zero.

  The optional attribute scan_output_axes specifies the axis along which the scan_outputs
  are accumulated for each scan_output. For example, if axis 1 is the time axis (to be
  scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis
  value of 1.
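
As a rough NumPy illustration of the two axis attributes, the sketch below scans a `[batch][time]` tensor along axis 1 and accumulates the outputs along axis 1 as well. The variable names are illustrative, and the running sum stands in for an arbitrary body graph.

```python
import numpy as np

x = np.arange(6, dtype=np.float32).reshape(2, 3)  # [batch=2][time=3]
scan_axis = 1                                     # plays the role of both axis attributes

state = np.zeros(2, dtype=np.float32)  # one running sum per batch entry
elements = []
for t in range(x.shape[scan_axis]):
    x_t = np.take(x, t, axis=scan_axis)  # shape (2,): rank is one less than x
    state = state + x_t
    elements.append(state)

# Accumulate the per-iteration elements along axis 1, giving shape (2, 3).
y = np.stack(elements, axis=scan_axis)
```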

  Note that because of the ONNX restriction that only the last parameter of an operator can
  be variadic, the initial-states and scan-inputs are listed together as one input parameter.
  Similarly, the final-states and scan-outputs are listed together as one output parameter.
  The attribute num_scan_inputs indicates the number M of scan-inputs.

  The behavior of

      Scan <
          num_scan_inputs = m,
          body = loop-body,
          scan_input_axes = [axis_1, ..., axis_m]
      > (init_1, ..., init_n, scan_1, ..., scan_m)

  is equivalent to the following pseudo-code:

      // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
      // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
      sequence_length = scan_1.shape[axis_1];

      // initialize state-variables
      st_1 = init_1; ... st_n = init_n;
      // initialize scan-output variables: [] denotes an empty tensor
      scan_out_1 = []; ...; scan_out_k = [];
      // identify number of iterations:

      // execute loop
      for (int t = 0; t < sequence_length; ++t) {
          // generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor
          // of rank one less than T obtained by indexing T at position t along axis k.
          si_1 = scan_1<axis=axis_1>[t];
          ... ;
          si_m = scan_m<axis=axis_m>[t];
          // execute loop-body
          st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
          // accumulate the scan-output elements
          scan_out_1 = Concat<axis=0>(scan_out_1, so_1); ... ; scan_out_k = Concat<axis=0>(scan_out_k, so_k);
      }

      return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
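
The pseudo-code above can be mirrored in NumPy. This is a minimal sketch under stated assumptions: `scan_reference` is a hypothetical name, the body is an ordinary Python callable rather than a graph, only forward scanning with output axis 0 is handled, and `np.stack` plays the role of `Concat<axis=0>` over the per-iteration elements.

```python
import numpy as np


def scan_reference(body, init_states, scan_inputs, scan_input_axes=None):
    """Forward-only Scan sketch: returns (final states..., scan outputs...)."""
    n = len(init_states)
    axes = scan_input_axes or [0] * len(scan_inputs)
    sequence_length = scan_inputs[0].shape[axes[0]]
    states = list(init_states)
    scan_outs = None
    for t in range(sequence_length):
        # Sub-tensors of rank one less than each scan input.
        elems = [np.take(s, t, axis=a) for s, a in zip(scan_inputs, axes)]
        results = body(*states, *elems)
        states, outs = list(results[:n]), results[n:]
        if scan_outs is None:
            scan_outs = [[] for _ in outs]
        for acc, o in zip(scan_outs, outs):
            acc.append(o)
    stacked = [np.stack(acc, axis=0) for acc in (scan_outs or [])]
    return (*states, *stacked)


# Running sum: one state variable, one scan input, one scan output.
initial = np.zeros(2, dtype=np.float32)
x = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32)
final, out = scan_reference(lambda s, x_t: (s + x_t, s + x_t), [initial], [x])
```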

  *Sample usage: Encoding RNN using a Scan*

  The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,
  recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can
  be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes
  %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these
  values are computed in the outer graph, they need to be passed in as extra state_variables.

      graph rnn-encoding {
        %H_0 = ...
        %X = ...
        %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X)
        return %Y, %Y_h
      }

      graph rnn-cell-1 (
        %H_tminus1[FLOAT, tensor]
        %X_t[FLOAT, tensor]
      ) {
        %Wi = ...
        %Ri = ...
        %Wbi = ...
        %Rbi = ...
        %t1 = %X_t * (%Wi^T)
        %t2 = %H_tminus1 * (%Ri^T)
        %t3 = Add(%t1, %t2)
        %t4 = Add(%t3, %Wbi)
        %t5 = Add(%t4, %Rbi)
        %Ht = Tanh(%t5)
        %Accumulate = Identity(%Ht)
        return %Ht, %Accumulate
      }
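
For comparison, the same recurrence can be written as a plain NumPy loop. This is a sketch only: the tensor sizes and the random weights are assumptions made for illustration, and `rnn_cell` is a hypothetical stand-in for the `rnn-cell-1` body graph.

```python
import numpy as np


def rnn_cell(h_prev, x_t, Wi, Ri, Wbi, Rbi):
    """One iteration of the body: returns (new state, scan_output element)."""
    h_t = np.tanh(x_t @ Wi.T + h_prev @ Ri.T + Wbi + Rbi)
    return h_t, h_t


# Illustrative sizes and random values (not taken from the specification).
rng = np.random.default_rng(0)
seq_len, batch, input_size, hidden_size = 4, 2, 3, 5
X = rng.standard_normal((seq_len, batch, input_size))
H_0 = np.zeros((batch, hidden_size))
Wi = rng.standard_normal((hidden_size, input_size))
Ri = rng.standard_normal((hidden_size, hidden_size))
Wbi = rng.standard_normal(hidden_size)
Rbi = rng.standard_normal(hidden_size)

h = H_0
ys = []
for t in range(seq_len):  # scan over axis 0 of X
    h, y_t = rnn_cell(h, X[t], Wi, Ri, Wbi, Rbi)
    ys.append(y_t)

Y_h = h                   # final state
Y = np.stack(ys, axis=0)  # scan output: all hidden states
```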


#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Scan-8">8</a>, <a href="Changelog.md#Scan-9">9</a>, <a href="Changelog.md#Scan-11">11</a>, <a href="Changelog.md#Scan-16">16</a>, <a href="Changelog.md#Scan-19">19</a>

#### Attributes

<dl>
<dt><tt>body</tt> : graph (required)</dt>
<dd>The graph run each iteration. It has N+M inputs: (loop state variables..., scan_input_elts...). It has N+K outputs: (loop state variables..., scan_output_elts...). Each scan_output is created by concatenating the value of the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.</dd>
<dt><tt>num_scan_inputs</tt> : int (required)</dt>
<dd>An attribute specifying the number of scan_inputs M. </dd>
<dt><tt>scan_input_axes</tt> : list of ints</dt>
<dd>An optional list of M flags. The i-th element of the list specifies the axis to be scanned (the sequence axis) for the i-th scan_input. If omitted, 0 will be used as the scan axis for every scan_input. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
<dt><tt>scan_input_directions</tt> : list of ints</dt>
<dd>An optional list of M flags. The i-th element of the list specifies the direction to be scanned for the i-th scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.</dd>
<dt><tt>scan_output_axes</tt> : list of ints</dt>
<dd>An optional list of K flags. The i-th element of the list specifies the axis for the i-th scan_output. The scan outputs are accumulated along the specified axis. If omitted, 0 will be used as the scan axis for every scan_output. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1].</dd>
<dt><tt>scan_output_directions</tt> : list of ints</dt>
<dd>An optional list of K flags, one for each scan_output. The i-th element of the list specifies whether the i-th scan_output should be constructed by appending or prepending a new value in each iteration: 0 indicates appending and 1 indicates prepending. If omitted, all scan_output tensors will be produced by appending a value in each iteration.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>initial_state_and_scan_inputs</tt> (variadic, heterogeneous) : V</dt>
<dd>Initial values of the loop's N state variables followed by M scan_inputs</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>final_state_and_scan_outputs</tt> (variadic, heterogeneous) : V</dt>
<dd>Final values of the loop's N state variables followed by K scan_outputs</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>All Tensor types up to IRv10.</dd>
</dl>


#### Examples

<details>
<summary>scan_8</summary>

```python
# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well as the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info(
    "sum_in", onnx.TensorProto.FLOAT, [2]
)
next = onnx.helper.make_tensor_value_info(  # noqa: A001
    "next", onnx.TensorProto.FLOAT, [2]
)
sum_out = onnx.helper.make_tensor_value_info(
    "sum_out", onnx.TensorProto.FLOAT, [2]
)
scan_out = onnx.helper.make_tensor_value_info(
    "scan_out", onnx.TensorProto.FLOAT, [2]
)
add_node = onnx.helper.make_node(
    "Add", inputs=["sum_in", "next"], outputs=["sum_out"]
)
id_node = onnx.helper.make_node(
    "Identity", inputs=["sum_out"], outputs=["scan_out"]
)
scan_body = onnx.helper.make_graph(
    [add_node, id_node], "scan_body", [sum_in, next], [sum_out, scan_out]
)
# create scan op node
no_sequence_lens = ""  # optional input, not supplied
node = onnx.helper.make_node(
    "Scan",
    inputs=[no_sequence_lens, "initial", "x"],
    outputs=["y", "z"],
    num_scan_inputs=1,
    body=scan_body,
)
# create inputs for batch-size 1, sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((1, 2))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((1, 3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((1, 2))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((1, 3, 2))

expect(
    node,
    inputs=[initial, x],
    outputs=[y, z],
    name="test_scan_sum",
    opset_imports=[onnx.helper.make_opsetid("", 8)],
)
```

</details>


<details>
<summary>scan_9</summary>

```python
# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well as the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info(
    "sum_in", onnx.TensorProto.FLOAT, [2]
)
next = onnx.helper.make_tensor_value_info(  # noqa: A001
    "next", onnx.TensorProto.FLOAT, [2]
)
sum_out = onnx.helper.make_tensor_value_info(
    "sum_out", onnx.TensorProto.FLOAT, [2]
)
scan_out = onnx.helper.make_tensor_value_info(
    "scan_out", onnx.TensorProto.FLOAT, [2]
)
add_node = onnx.helper.make_node(
    "Add", inputs=["sum_in", "next"], outputs=["sum_out"]
)
id_node = onnx.helper.make_node(
    "Identity", inputs=["sum_out"], outputs=["scan_out"]
)
scan_body = onnx.helper.make_graph(
    [add_node, id_node], "scan_body", [sum_in, next], [sum_out, scan_out]
)
# create scan op node
node = onnx.helper.make_node(
    "Scan",
    inputs=["initial", "x"],
    outputs=["y", "z"],
    num_scan_inputs=1,
    body=scan_body,
)
# create inputs for sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((2,))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((2,))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((3, 2))

expect(
    node,
    inputs=[initial, x],
    outputs=[y, z],
    name="test_scan9_sum",
    opset_imports=[onnx.helper.make_opsetid("", 9)],
)
```

</details>


### <a name="Scatter"></a><a name="scatter">**Scatter** (deprecated)</a>

  This operator is deprecated. Please use ScatterElements, which provides the same functionality.

  Scatter takes three inputs `data`, `updates`, and `indices` of the same
  rank r >= 1 and an optional attribute axis that identifies an axis of `data`
  (by default, the outer-most axis, that is axis 0). The output of the operation
  is produced by creating a copy of the input `data`, and then updating its value
  to values specified by `updates` at specific index positions specified by
  `indices`. Its output shape is the same as the shape of `data`.

  For each entry in `updates`, the target index in `data` is obtained by combining
  the corresponding entry in `indices` with the index of the entry itself: the
  index-value for dimension = axis is obtained from the value of the corresponding
  entry in `indices` and the index-value for dimension != axis is obtained from the
  index of the entry itself.

  For instance, in a 2-D tensor case, the update corresponding to the [i][j] entry
  is performed as below:
  ```
    output[indices[i][j]][j] = updates[i][j] if axis = 0,
    output[i][indices[i][j]] = updates[i][j] if axis = 1,
  ```

  This operator is the inverse of GatherElements. It is similar to Torch's Scatter operation.

  Example 1:
  ```
    data = [
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
    ]
    indices = [
        [1, 0, 2],
        [0, 2, 1],
    ]
    updates = [
        [1.0, 1.1, 1.2],
        [2.0, 2.1, 2.2],
    ]
    output = [
        [2.0, 1.1, 0.0],
        [1.0, 0.0, 2.2],
        [0.0, 2.1, 1.2],
    ]
  ```
  Example 2:
  ```
    data = [[1.0, 2.0, 3.0, 4.0, 5.0]]
    indices = [[1, 3]]
    updates = [[1.1, 2.1]]
    axis = 1
    output = [[1.0, 1.1, 3.0, 2.1, 5.0]]
  ```
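
Both examples can be reproduced with NumPy's `np.put_along_axis`, which applies the same index rule when no reduction is involved. This is a sketch, not the reference implementation: `scatter_sketch` is a hypothetical name, and it assumes `indices` matches `data` in every dimension other than `axis`.

```python
import numpy as np


def scatter_sketch(data, indices, updates, axis=0):
    """Copy data, then write updates at the positions selected by indices."""
    output = np.copy(data)
    np.put_along_axis(output, indices, updates, axis=axis)
    return output


# Example 1 (default axis 0)
out1 = scatter_sketch(
    np.zeros((3, 3), dtype=np.float32),
    np.array([[1, 0, 2], [0, 2, 1]], dtype=np.int64),
    np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]], dtype=np.float32),
)

# Example 2 (axis 1)
out2 = scatter_sketch(
    np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32),
    np.array([[1, 3]], dtype=np.int64),
    np.array([[1.1, 2.1]], dtype=np.float32),
    axis=1,
)
```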

#### Version

This version of the operator has been deprecated since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Scatter-9">9</a>


#### Examples

<details>
<summary>scatter_with_axis</summary>

```python
axis = 1
node = onnx.helper.make_node(
    "Scatter",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    axis=axis,
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)

y = scatter(data, indices, updates, axis=axis)
# print(y) produces
# [[1.0, 1.1, 3.0, 2.1, 5.0]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_with_axis",
    opset_imports=[onnx.helper.make_opsetid("", 10)],
)
```

</details>


<details>
<summary>scatter_without_axis</summary>

```python
node = onnx.helper.make_node(
    "Scatter",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
)
data = np.zeros((3, 3), dtype=np.float32)
indices = np.array([[1, 0, 2], [0, 2, 1]], dtype=np.int64)
updates = np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]], dtype=np.float32)

y = scatter(data, indices, updates)
# print(y) produces
# [[2.0, 1.1, 0.0],
#  [1.0, 0.0, 2.2],
#  [0.0, 2.1, 1.2]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_without_axis",
    opset_imports=[onnx.helper.make_opsetid("", 10)],
)
```

</details>


### <a name="ScatterElements"></a><a name="scatterelements">**ScatterElements**</a>

  ScatterElements takes three inputs `data`, `updates`, and `indices` of the same
  rank r >= 1 and an optional attribute axis that identifies an axis of `data`
  (by default, the outer-most axis, that is axis 0). The output of the operation
  is produced by creating a copy of the input `data`, and then updating its value
  to values specified by `updates` at specific index positions specified by
  `indices`. Its output shape is the same as the shape of `data`.

  For each entry in `updates`, the target index in `data` is obtained by combining
  the corresponding entry in `indices` with the index of the entry itself: the
  index-value for dimension = axis is obtained from the value of the corresponding
  entry in `indices` and the index-value for dimension != axis is obtained from the
  index of the entry itself.

  `reduction` allows specification of an optional reduction operation, which is applied when writing
  the values of the `updates` tensor into `output` at the specified `indices`.
  In cases where `reduction` is set to "none", indices should not have duplicate entries: that is, if idx1 != idx2,
  then indices[idx1] != indices[idx2]. For instance, in a 2-D tensor case, the update
  corresponding to the [i][j] entry is performed as below:
  ```
  output[indices[i][j]][j] = updates[i][j] if axis = 0,
  output[i][indices[i][j]] = updates[i][j] if axis = 1,
  ```
  When `reduction` is set to some reduction function `f`, the update corresponding to the [i][j] entry is performed as below:
  ```
  output[indices[i][j]][j] = f(output[indices[i][j]][j], updates[i][j]) if axis = 0,
  output[i][indices[i][j]] = f(output[i][indices[i][j]], updates[i][j]) if axis = 1,
  ```
  where `f` is `+`, `*`, `max` or `min` as specified.
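
The update rule generalizes from the 2-D case to any rank by replacing only the `axis` coordinate of each index. The sketch below writes that out as an explicit loop; `scatter_elements_sketch` and `_REDUCTIONS` are hypothetical names, and the loop is meant to show the semantics rather than to be efficient.

```python
import numpy as np

# Hypothetical mapping from the `reduction` attribute to the function f.
_REDUCTIONS = {
    "none": lambda current, update: update,
    "add": lambda current, update: current + update,
    "mul": lambda current, update: current * update,
    "max": np.maximum,
    "min": np.minimum,
}


def scatter_elements_sketch(data, indices, updates, axis=0, reduction="none"):
    f = _REDUCTIONS[reduction]
    output = np.copy(data)
    for idx in np.ndindex(indices.shape):
        target = list(idx)
        target[axis] = indices[idx]  # only the coordinate along `axis` changes
        target = tuple(target)
        output[target] = f(output[target], updates[idx])
    return output


data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 1]], dtype=np.int64)  # duplicate target on purpose
updates = np.array([[1.1, 2.1]], dtype=np.float32)

added = scatter_elements_sketch(data, indices, updates, axis=1, reduction="add")
largest = scatter_elements_sketch(data, indices, updates, axis=1, reduction="max")
smallest = scatter_elements_sketch(data, indices, updates, axis=1, reduction="min")
```

With a reduction, duplicate indices are well defined because every update is folded into the current output value: here position 1 becomes 2.0 + 1.1 + 2.1 for `add`.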

  This operator is the inverse of GatherElements. It is similar to Torch's Scatter operation.

  (Opset 18 change): Adds max/min to the set of allowed reduction ops.

  Example 1:
  ```
  data = [
      [0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0],
  ]
  indices = [
      [1, 0, 2],
      [0, 2, 1],
  ]
  updates = [
      [1.0, 1.1, 1.2],
      [2.0, 2.1, 2.2],
  ]
  output = [
      [2.0, 1.1, 0.0],
      [1.0, 0.0, 2.2],
      [0.0, 2.1, 1.2],
  ]
  ]
  ```
  Example 2:
  ```
  data = [[1.0, 2.0, 3.0, 4.0, 5.0]]
  indices = [[1, 3]]
  updates = [[1.1, 2.1]]
  axis = 1
  output = [[1.0, 1.1, 3.0, 2.1, 5.0]]
  ```
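
The statement that this operator is the inverse of GatherElements can be illustrated with Example 1: gathering from the scattered output at the same `indices` returns the original `updates`. The sketch uses NumPy's `np.put_along_axis` and `np.take_along_axis` as stand-ins for the two operators, and relies on there being no duplicate indices and no reduction.

```python
import numpy as np

data = np.zeros((3, 3), dtype=np.float32)
indices = np.array([[1, 0, 2], [0, 2, 1]], dtype=np.int64)
updates = np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]], dtype=np.float32)

output = data.copy()
np.put_along_axis(output, indices, updates, axis=0)      # scatter step
recovered = np.take_along_axis(output, indices, axis=0)  # gather step
```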

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ScatterElements-11">11</a>, <a href="Changelog.md#ScatterElements-13">13</a>, <a href="Changelog.md#ScatterElements-16">16</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 0)</dt>
<dd>Which axis to scatter on. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(data).</dd>
<dt><tt>reduction</tt> : string (default is none)</dt>
<dd>Type of reduction to apply: none (default), add, mul, max, min. 'none': no reduction applied. 'add': reduction using the addition operation. 'mul': reduction using the multiplication operation. 'max': reduction using the maximum operation. 'min': reduction using the minimum operation.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1.</dd>
<dt><tt>indices</tt> (non-differentiable) : Tind</dt>
<dd>Tensor of int32/int64 indices, of rank r >= 1 (same rank as input). All index values are expected to be within bounds [-s, s-1] along axis of size s. It is an error if any of the index values are out of bounds.</dd>
<dt><tt>updates</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1 (same rank and shape as indices).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1 (same rank as input).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Input and output types can be of any tensor type.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain indices to integer types</dd>
</dl>


#### Examples

<details>
<summary>scatter_elements_with_axis</summary>

```python
axis = 1
node = onnx.helper.make_node(
    "ScatterElements",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    axis=axis,
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)

y = scatter_elements(data, indices, updates, axis)
# print(y) produces
# [[1.0, 1.1, 3.0, 2.1, 5.0]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_elements_with_axis",
)
```

</details>


<details>
<summary>scatter_elements_with_duplicate_indices</summary>

```python
axis = 1
node = onnx.helper.make_node(
    "ScatterElements",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    axis=axis,
    reduction="add",
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 1]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)

y = scatter_elements(data, indices, updates, axis, reduction="add")
# print(y) produces
# [[1.0, 5.2, 3.0, 4.0, 5.0]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_elements_with_duplicate_indices",
)
```

</details>


<details>
<summary>scatter_elements_with_negative_indices</summary>

```python
axis = 1
node = onnx.helper.make_node(
    "ScatterElements",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    axis=axis,
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, -3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)

y = scatter_elements(data, indices, updates, axis)
# print(y) produces
# [[1.0, 1.1, 2.1, 4.0, 5.0]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_elements_with_negative_indices",
)
```

</details>


<details>
<summary>scatter_elements_with_reduction_max</summary>

```python
axis = 1
node = onnx.helper.make_node(
    "ScatterElements",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    axis=axis,
    reduction="max",
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 1]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)

y = scatter_elements(data, indices, updates, axis, reduction="max")
# print(y) produces
# [[1.0, 2.1, 3.0, 4.0, 5.0]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_elements_with_reduction_max",
)
```

</details>


<details>
<summary>scatter_elements_with_reduction_min</summary>

```python
axis = 1
node = onnx.helper.make_node(
    "ScatterElements",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    axis=axis,
    reduction="min",
)
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, 1]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)

y = scatter_elements(data, indices, updates, axis, reduction="min")
# print(y) produces
# [[1.0, 1.1, 3.0, 4.0, 5.0]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_elements_with_reduction_min",
)
```

</details>


<details>
<summary>scatter_elements_without_axis</summary>

```python
node = onnx.helper.make_node(
    "ScatterElements",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
)
data = np.zeros((3, 3), dtype=np.float32)
indices = np.array([[1, 0, 2], [0, 2, 1]], dtype=np.int64)
updates = np.array([[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]], dtype=np.float32)

y = scatter_elements(data, indices, updates)
# print(y) produces
# [[2.0, 1.1, 0.0],
#  [1.0, 0.0, 2.2],
#  [0.0, 2.1, 1.2]]

expect(
    node,
    inputs=[data, indices, updates],
    outputs=[y],
    name="test_scatter_elements_without_axis",
)
```

</details>
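
The examples above call a `scatter_elements` helper that is not shown in this section. The following is a minimal sketch of what such a helper might look like, assuming a non-negative `axis` and covering only the reductions that appear in these examples; the name `scatter_elements_sketch` is hypothetical and this is not the helper used by the ONNX test suite.

```python
import numpy as np


def scatter_elements_sketch(data, indices, updates, axis=0, reduction="none"):
    """Illustrative stand-in for the `scatter_elements` helper (an assumption)."""
    reducers = {"add": np.add, "max": np.maximum, "min": np.minimum}
    output = np.copy(data)
    size = data.shape[axis]
    for idx in np.ndindex(indices.shape):
        i = int(indices[idx])
        if not -size <= i < size:
            raise IndexError(f"index {i} is out of bounds for axis of size {size}")
        # Replace the coordinate along `axis`; negative values count from the end.
        target = idx[:axis] + (i % size,) + idx[axis + 1:]
        if reduction == "none":
            output[target] = updates[idx]
        else:
            output[target] = reducers[reduction](output[target], updates[idx])
    return output


data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32)
indices = np.array([[1, -3]], dtype=np.int64)
updates = np.array([[1.1, 2.1]], dtype=np.float32)
# Same inputs as the negative-indices example: index -3 addresses column 2.
print(scatter_elements_sketch(data, indices, updates, axis=1))
```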


### <a name="ScatterND"></a><a name="scatternd">**ScatterND**</a>

  ScatterND takes three inputs: a `data` tensor of rank r >= 1, an `indices` tensor of rank q >= 1,
  and an `updates` tensor of rank q + r - indices.shape[-1] - 1. The output of the operation
  is produced by creating a copy of the input `data`, and then updating its values to those
  specified by `updates` at the index positions specified by `indices`. Its output shape
  is the same as the shape of `data`.

  `indices` is an integer tensor. Let k denote indices.shape[-1], the last dimension in the shape of `indices`.
  `indices` is treated as a (q-1)-dimensional tensor of k-tuples, where each k-tuple is a partial-index into `data`.
  Hence, k can be at most the rank of `data`. When k equals rank(data), each update entry specifies an
  update to a single element of the tensor. When k is less than rank(data), each update entry specifies an
  update to a slice of the tensor. Index values are allowed to be negative, following the usual
  convention of counting backwards from the end, but are expected to be within the valid range.

  `updates` is treated as a (q-1)-dimensional tensor of replacement-slice-values. Thus, the
  first (q-1) dimensions of updates.shape must match the first (q-1) dimensions of indices.shape.
  The remaining dimensions of `updates` correspond to the dimensions of the
  replacement-slice-values. Each replacement-slice-value is a (r-k) dimensional tensor,
  corresponding to the trailing (r-k) dimensions of `data`.  Thus, the shape of `updates`
  must equal indices.shape[0:q-1] ++ data.shape[k:r], where ++ denotes the concatenation
  of shapes.
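
As a quick check of the shape rule, the sketch below computes the shape that `updates` is expected to have: the leading q-1 dimensions of `indices` followed by the trailing r-k dimensions of `data`. The helper name `expected_updates_shape` is hypothetical and is used only for illustration.

```python
def expected_updates_shape(data_shape, indices_shape):
    """Shape that `updates` should have for the given `data` and `indices` shapes."""
    k = indices_shape[-1]
    if k > len(data_shape):
        raise ValueError("indices.shape[-1] must not exceed rank(data)")
    # Leading q-1 dims of `indices`, then the trailing r-k dims of `data`.
    return tuple(indices_shape[:-1]) + tuple(data_shape[k:])


# data of shape (8,) with indices of shape (4, 1): four single-element updates.
print(expected_updates_shape((8,), (4, 1)))       # (4,)
# data of shape (4, 4, 4) with indices of shape (2, 1): two (4, 4) slices.
print(expected_updates_shape((4, 4, 4), (2, 1)))  # (2, 4, 4)
```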

  The `output` is calculated via the following equation:

  ```
  output = np.copy(data)
  update_indices = indices.shape[:-1]
  for idx in np.ndindex(update_indices):
      output[indices[idx]] = updates[idx]
  ```

  The order of iteration in the above loop is not specified.
  In particular, indices should not have duplicate entries: that is, if idx1 != idx2, then indices[idx1] != indices[idx2].
  This ensures that the output value does not depend on the iteration order.
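
A runnable NumPy rendering of this loop is sketched below, assuming that each k-tuple is converted with `tuple(...)` so that NumPy treats it as a (partial) index rather than as a list of row numbers. The function name `scatter_nd_sketch` is hypothetical. It shows both an element update (k equal to the rank of `data`) and a slice update (k smaller than the rank).

```python
import numpy as np


def scatter_nd_sketch(data, indices, updates):
    """Illustrative ScatterND without reduction (assumes no duplicate indices)."""
    output = np.copy(data)
    for idx in np.ndindex(indices.shape[:-1]):
        # tuple(...) turns the k-tuple into a partial index into `output`.
        output[tuple(indices[idx])] = updates[idx]
    return output


# k == rank(data): each update replaces a single element.
data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
indices = np.array([[4], [3], [1], [7]])
updates = np.array([9, 10, 11, 12])
print(scatter_nd_sketch(data, indices, updates).tolist())
# [1, 11, 3, 10, 9, 6, 7, 12]

# k < rank(data): each update replaces a whole row.
rows = np.zeros((3, 2), dtype=np.int64)
row_indices = np.array([[2], [0]])
row_updates = np.array([[1, 2], [3, 4]])
print(scatter_nd_sketch(rows, row_indices, row_updates).tolist())
# [[3, 4], [0, 0], [1, 2]]
```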

  `reduction` specifies an optional reduction operation that combines each value in the `updates`
  tensor with the value already in `output` at the specified `indices`.
  In cases where `reduction` is set to "none", indices should not have duplicate entries: that is, if idx1 != idx2,
  then indices[idx1] != indices[idx2]. This ensures that the output value does not depend on the iteration order.
  When `reduction` is set to some reduction function `f`, `output` is calculated as follows:

  ```
  output = np.copy(data)
  update_indices = indices.shape[:-1]
  for idx in np.ndindex(update_indices):
      output[indices[idx]] = f(output[indices[idx]], updates[idx])
  ```

  where `f` is `+`, `*`, `max` or `min` as specified.
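
The reduction variant can be sketched in NumPy as follows. The mapping from attribute strings to NumPy functions and the name `scatter_nd_reduce_sketch` are assumptions made for illustration. Because all four reductions are commutative and associative, duplicate indices produce the same values (up to floating-point rounding) regardless of iteration order.

```python
import numpy as np

# Assumed mapping from the `reduction` attribute to an elementwise function.
REDUCERS = {
    "add": np.add,
    "mul": np.multiply,
    "max": np.maximum,
    "min": np.minimum,
}


def scatter_nd_reduce_sketch(data, indices, updates, reduction="none"):
    """Illustrative ScatterND with an optional reduction."""
    output = np.copy(data)
    for idx in np.ndindex(indices.shape[:-1]):
        target = tuple(indices[idx])
        if reduction == "none":
            output[target] = updates[idx]
        else:
            output[target] = REDUCERS[reduction](output[target], updates[idx])
    return output


data = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
indices = np.array([[1], [1], [3]], dtype=np.int64)  # index 1 appears twice
updates = np.array([10.0, 20.0, 5.0], dtype=np.float32)

print(scatter_nd_reduce_sketch(data, indices, updates, "add").tolist())
# [1.0, 32.0, 3.0, 9.0]
print(scatter_nd_reduce_sketch(data, indices, updates, "max").tolist())
# [1.0, 20.0, 3.0, 5.0]
```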

  This operator is the inverse of GatherND.
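
One way to read the inverse relationship: when `indices` has no duplicates and no reduction is applied, gathering from the scattered output at the same `indices` returns `updates`. The sketch below illustrates this with a simplified gather that ignores GatherND's `batch_dims` attribute; both function names are hypothetical.

```python
import numpy as np


def scatter_nd_sketch(data, indices, updates):
    """Illustrative ScatterND without reduction."""
    output = np.copy(data)
    for idx in np.ndindex(indices.shape[:-1]):
        output[tuple(indices[idx])] = updates[idx]
    return output


def gather_nd_sketch(data, indices):
    """Simplified gather: reads the slice addressed by each k-tuple."""
    k = indices.shape[-1]
    out = np.empty(indices.shape[:-1] + data.shape[k:], dtype=data.dtype)
    for idx in np.ndindex(indices.shape[:-1]):
        out[idx] = data[tuple(indices[idx])]
    return out


data = np.arange(12).reshape(3, 4)
indices = np.array([[2], [0]])
updates = np.array([[-1, -2, -3, -4], [-5, -6, -7, -8]])

scattered = scatter_nd_sketch(data, indices, updates)
# Gathering at the same indices recovers the updates that were written.
print(np.array_equal(gather_nd_sketch(scattered, indices), updates))  # True
```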

  (Opset 18 change): Adds max/min to the set of allowed reduction ops.

  Example 1:
  ```
  data    = [1, 2, 3, 4, 5, 6, 7, 8]
  indices = [[4], [3], [1], [7]]
  updates = [9, 10, 11, 12]
  output  = [1, 11, 3, 10, 9, 6, 7, 12]
  ```

  Example 2:
  ```
  data    = [[[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
              [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
              [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
              [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]]
  indices = [[0], [2]]
  updates = [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
              [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]]
  output  = [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
              [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
              [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
              [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]]
  ```

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#ScatterND-11">11</a>, <a href="Changelog.md#ScatterND-13">13</a>, <a href="Changelog.md#ScatterND-16">16</a>

#### Attributes

<dl>
<dt><tt>reduction</tt> : string (default is none)</dt>
<dd>Type of reduction to apply: none (default), add, mul, max, min. 'none': no reduction applied. 'add': reduction using the addition operation. 'mul': reduction using the multiplication operation. 'max': reduction using the maximum operation. 'min': reduction using the minimum operation.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1.</dd>
<dt><tt>indices</tt> (non-differentiable) : tensor(int64)</dt>
<dd>Tensor of rank q >= 1.</dd>
<dt><tt>updates</tt> (differentiable) : T</dt>
<dd>Tensor of rank q + r - indices_shape[-1] - 1.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor of rank r >= 1.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to any tensor type.</dd>
</dl>


#### Examples

<details>
<summary>scatternd</summary>

```python
node = onnx.helper.make_node(
    "ScatterND",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
)
data = np.array(
    [
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
    ],
    dtype=np.float32,
)
indices = np.array([[0], [2]], dtype=np.int64)
updates = np.array(
    [
        [[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
        [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
    ],
    dtype=np.float32,
)
# Expecting output as np.array(
#    [[[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
#     [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
#     [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates)
expect(
    node,
    inputs=[data, indices, updates],
    outputs=[output],
    name="test_scatternd",
)
```

</details>


<details>
<summary>scatternd_add</summary>

```python
node = onnx.helper.make_node(
    "ScatterND",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    reduction="add",
)
data = np.array(
    [
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
    ],
    dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
    [
        [[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
        [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
    ],
    dtype=np.float32,
)
# Expecting output as np.array(
#    [[[7, 8, 9, 10], [13, 14, 15, 16], [18, 17, 16, 15], [16, 15, 14, 13]],
#     [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="add")
expect(
    node,
    inputs=[data, indices, updates],
    outputs=[output],
    name="test_scatternd_add",
)
```

</details>


<details>
<summary>scatternd_max</summary>

```python
node = onnx.helper.make_node(
    "ScatterND",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    reduction="max",
)
data = np.array(
    [
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
    ],
    dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
    [
        [[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
        [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
    ],
    dtype=np.float32,
)
# Expecting output as np.array(
#    [[[5, 5, 5, 5], [6, 6, 7, 8], [8, 7, 7, 7], [8, 8, 8, 8]],
#     [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="max")
expect(
    node,
    inputs=[data, indices, updates],
    outputs=[output],
    name="test_scatternd_max",
)
```

</details>


<details>
<summary>scatternd_min</summary>

```python
node = onnx.helper.make_node(
    "ScatterND",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    reduction="min",
)
data = np.array(
    [
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
    ],
    dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
    [
        [[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
        [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
    ],
    dtype=np.float32,
)
# Expecting output as np.array(
#    [[[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 3, 2, 1]],
#     [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="min")
expect(
    node,
    inputs=[data, indices, updates],
    outputs=[output],
    name="test_scatternd_min",
)
```

</details>


<details>
<summary>scatternd_multiply</summary>

```python
node = onnx.helper.make_node(
    "ScatterND",
    inputs=["data", "indices", "updates"],
    outputs=["y"],
    reduction="mul",
)
data = np.array(
    [
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
        [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
    ],
    dtype=np.float32,
)
indices = np.array([[0], [0]], dtype=np.int64)
updates = np.array(
    [
        [[5, 5, 5, 5], [6, 6, 6, 6], [7, 7, 7, 7], [8, 8, 8, 8]],
        [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]],
    ],
    dtype=np.float32,
)
# Expecting output as np.array(
#    [[[5, 10, 15, 20], [60, 72, 84, 96], [168, 147, 126, 105], [128, 96, 64, 32]],
#     [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]],
#     [[8, 7, 6, 5], [4, 3, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]]], dtype=np.float32)
output = scatter_nd_impl(data, indices, updates, reduction="mul")
expect(
    node,
    inputs=[data, indices, updates],
    outputs=[output],
    name="test_scatternd_multiply",
)
```

</details>


### <a name="Selu"></a><a name="selu">**Selu**</a>

  Selu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the scaled exponential linear unit function,
  `y = gamma * (alpha * e^x - alpha) for x <= 0`, `y = gamma * x for x > 0`,
  is applied to the tensor elementwise.
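
The piecewise formula can be written in NumPy roughly as follows. This is a sketch: the name `selu_sketch` is hypothetical, and the default values are the float32 constants listed under Attributes below.

```python
import numpy as np


def selu_sketch(x, alpha=1.67326319217681884765625, gamma=1.05070102214813232421875):
    """Elementwise SELU following the piecewise definition."""
    x = np.asarray(x, dtype=np.float32)
    # np.minimum keeps exp() from overflowing on large positive inputs,
    # whose result in this branch is discarded by np.where anyway.
    negative_branch = gamma * (alpha * np.exp(np.minimum(x, 0)) - alpha)
    return np.where(x > 0, gamma * x, negative_branch).astype(np.float32)


# With alpha=2.0 and gamma=3.0 this gives approximately [-3.7927, 0.0, 3.0].
print(selu_sketch([-1, 0, 1], alpha=2.0, gamma=3.0))
```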

#### Version

This version of the operator has been available since version 6 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Selu-1">1</a>

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 1.67326)</dt>
<dd>Coefficient of SELU; defaults to 1.67326319217681884765625 (i.e., float32 approximation of 1.6732632423543772848170429916717).</dd>
<dt><tt>gamma</tt> : float (default is 1.0507)</dt>
<dd>Coefficient of SELU; defaults to 1.05070102214813232421875 (i.e., float32 approximation of 1.0507009873554804934193349852946).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>selu</summary>

```python
node = onnx.helper.make_node(
    "Selu", inputs=["x"], outputs=["y"], alpha=2.0, gamma=3.0
)

x = np.array([-1, 0, 1]).astype(np.float32)
# expected output [-3.79272318, 0., 3.]
y = (
    np.clip(x, 0, np.inf) * 3.0
    + (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0 * 3.0
)
expect(node, inputs=[x], outputs=[y], name="test_selu_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = (
    np.clip(x, 0, np.inf) * 3.0
    + (np.exp(np.clip(x, -np.inf, 0)) - 1) * 2.0 * 3.0
)
expect(node, inputs=[x], outputs=[y], name="test_selu")
```

</details>


<details>
<summary>selu_default</summary>

```python
default_alpha = 1.67326319217681884765625
default_gamma = 1.05070102214813232421875
node = onnx.helper.make_node(
    "Selu",
    inputs=["x"],
    outputs=["y"],
)
x = np.random.randn(3, 4, 5).astype(np.float32)
y = (
    np.clip(x, 0, np.inf) * default_gamma
    + (np.exp(np.clip(x, -np.inf, 0)) - 1) * default_alpha * default_gamma
)
expect(node, inputs=[x], outputs=[y], name="test_selu_default")
```

</details>


### <a name="SequenceAt"></a><a name="sequenceat">**SequenceAt**</a>

  Outputs a tensor copy from the tensor at 'position' in 'input_sequence'.
  Accepted range for 'position' is in `[-n, n - 1]`, where `n` is the number of tensors in 'input_sequence'.
  Negative value means counting positions from the back.
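
The position semantics can be illustrated with a Python list standing in for the sequence and NumPy arrays standing in for tensors. This is only a sketch; `sequence_at_sketch` is a hypothetical name.

```python
import numpy as np


def sequence_at_sketch(input_sequence, position):
    """Return a copy of the tensor at `position`; negative values count from the back."""
    n = len(input_sequence)
    if not -n <= position <= n - 1:
        raise IndexError(f"position {position} is outside [-{n}, {n - 1}]")
    return np.copy(input_sequence[position])


sequence = [np.array([1, 2, 3, 4]), np.array([5, 6, 7]), np.array([8, 9])]
print(sequence_at_sketch(sequence, 0).tolist())   # [1, 2, 3, 4]
print(sequence_at_sketch(sequence, -1).tolist())  # [8, 9]
```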

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input_sequence</tt> : S</dt>
<dd>Input sequence.</dd>
<dt><tt>position</tt> : I</dt>
<dd>Position of the tensor in the sequence. A negative value means counting positions from the back. The accepted range is `[-n, n - 1]`, where `n` is the number of tensors in 'input_sequence'. It is an error if the position is out of bounds. It must be a scalar (tensor of empty shape).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>tensor</tt> : T</dt>
<dd>Output tensor at the specified position in the input sequence.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain to any tensor type.</dd>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain to any tensor type.</dd>
<dt><tt>I</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain position to integral tensor. It must be a scalar (tensor of empty shape).</dd>
</dl>


### <a name="SequenceConstruct"></a><a name="sequenceconstruct">**SequenceConstruct**</a>

  Construct a tensor sequence containing 'inputs' tensors.
  All tensors in 'inputs' must have the same data type.

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>inputs</tt> (variadic) : T</dt>
<dd>Tensors.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output_sequence</tt> : S</dt>
<dd>Sequence enclosing the input tensors.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input types to any tensor type.</dd>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain output types to any tensor type.</dd>
</dl>


### <a name="SequenceEmpty"></a><a name="sequenceempty">**SequenceEmpty**</a>

  Construct an empty tensor sequence, with given data type.

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>dtype</tt> : int</dt>
<dd>(Optional) The data type of the tensors in the output sequence. The default type is 'float'.</dd>
</dl>

#### Inputs


#### Outputs

<dl>
<dt><tt>output</tt> : S</dt>
<dd>Empty sequence.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain output types to any tensor type.</dd>
</dl>


### <a name="SequenceErase"></a><a name="sequenceerase">**SequenceErase**</a>

  Outputs a tensor sequence that removes the tensor at 'position' from 'input_sequence'.
  Accepted range for 'position' is in `[-n, n - 1]`, where `n` is the number of tensors in 'input_sequence'.
  Negative value means counting positions from the back.
  'position' is optional; by default, the last tensor in 'input_sequence' is erased.
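
A minimal sketch of these semantics, using a Python list as the sequence and nested lists as stand-ins for tensors; `sequence_erase_sketch` is a hypothetical name.

```python
def sequence_erase_sketch(input_sequence, position=None):
    """Return a new sequence without the tensor at `position` (default: the last one)."""
    n = len(input_sequence)
    if position is None:
        position = -1  # default: erase the last tensor
    if not -n <= position <= n - 1:
        raise IndexError(f"position {position} is outside [-{n}, {n - 1}]")
    position %= n  # map a negative position to its non-negative equivalent
    return input_sequence[:position] + input_sequence[position + 1:]


sequence = [[1, 2, 3, 4], [5, 6, 7], [8, 9]]
print(sequence_erase_sketch(sequence))      # [[1, 2, 3, 4], [5, 6, 7]]
print(sequence_erase_sketch(sequence, 0))   # [[5, 6, 7], [8, 9]]
print(sequence_erase_sketch(sequence, -2))  # [[1, 2, 3, 4], [8, 9]]
```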

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs (1 - 2)

<dl>
<dt><tt>input_sequence</tt> : S</dt>
<dd>Input sequence.</dd>
<dt><tt>position</tt> (optional) : I</dt>
<dd>Position of the tensor in the sequence. A negative value means counting positions from the back. The accepted range is `[-n, n - 1]`, where `n` is the number of tensors in 'input_sequence'. It is an error if the position is out of bounds. It must be a scalar (tensor of empty shape).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output_sequence</tt> : S</dt>
<dd>Output sequence that has the tensor at the specified position removed.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain to any tensor type.</dd>
<dt><tt>I</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain position to integral tensor. It must be a scalar (tensor of empty shape).</dd>
</dl>


### <a name="SequenceInsert"></a><a name="sequenceinsert">**SequenceInsert**</a>

  Outputs a tensor sequence that inserts 'tensor' into 'input_sequence' at 'position'.
  'tensor' must have the same data type as 'input_sequence'.
  Accepted range for 'position' is in `[-n, n]`, where `n` is the number of tensors in 'input_sequence'.
  Negative value means counting positions from the back.
  'position' is optional; by default, 'tensor' is inserted at the back of 'input_sequence'.
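
The insertion rule can be sketched with plain Python lists. The name `sequence_insert_sketch` is hypothetical, and the treatment of negative positions (offset by `n`, as with Python's `list.insert`) is an assumption about how "counting from the back" is interpreted.

```python
def sequence_insert_sketch(input_sequence, tensor, position=None):
    """Return a new sequence with `tensor` inserted at `position` (default: the back)."""
    n = len(input_sequence)
    if position is None:
        position = n  # default: insert at the back
    if not -n <= position <= n:
        raise IndexError(f"position {position} is outside [-{n}, {n}]")
    if position < 0:
        position += n  # assumed: negative positions are offset by n
    return input_sequence[:position] + [tensor] + input_sequence[position:]


sequence = [[1, 2, 3, 4], [5, 6, 7], [8, 9]]
print(sequence_insert_sketch(sequence, [10, 11, 12]))
# [[1, 2, 3, 4], [5, 6, 7], [8, 9], [10, 11, 12]]
print(sequence_insert_sketch(sequence, [-2, -1, 0], 0))
# [[-2, -1, 0], [1, 2, 3, 4], [5, 6, 7], [8, 9]]
```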

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs (2 - 3)

<dl>
<dt><tt>input_sequence</tt> : S</dt>
<dd>Input sequence.</dd>
<dt><tt>tensor</tt> : T</dt>
<dd>Input tensor to be inserted into the input sequence.</dd>
<dt><tt>position</tt> (optional) : I</dt>
<dd>Position in the sequence where the new tensor is inserted. It is optional; the default is to insert at the back of the sequence. A negative value means counting positions from the back. The accepted range is `[-n, n]`, where `n` is the number of tensors in 'input_sequence'. It is an error if the position is out of bounds. It must be a scalar (tensor of empty shape).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output_sequence</tt> : S</dt>
<dd>Output sequence that contains the inserted tensor at given position.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain to any tensor type.</dd>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain to any tensor type.</dd>
<dt><tt>I</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain position to integral tensor. It must be a scalar (tensor of empty shape).</dd>
</dl>


#### Examples

<details>
<summary>sequenceinsert</summary>

```python
test_cases = {
    "at_back": [np.array([10, 11, 12]).astype(np.int64)],
    "at_front": [np.array([-2, -1, 0]), np.array([0]).astype(np.int64)],
}
sequence = [
    np.array([1, 2, 3, 4]).astype(np.int64),
    np.array([5, 6, 7]).astype(np.int64),
    np.array([8, 9]).astype(np.int64),
]

for test_name, test_inputs in test_cases.items():
    tensor = test_inputs[0].astype(np.int64)

    if len(test_inputs) > 1:
        node = onnx.helper.make_node(
            "SequenceInsert",
            inputs=["sequence", "tensor", "position"],
            outputs=["output_sequence"],
        )
        position = test_inputs[1]
        inserted = sequence_insert_reference_implementation(
            sequence, tensor, position
        )
        expect(
            node,
            inputs=[sequence, tensor, position],
            outputs=[inserted],
            name="test_sequence_insert_" + test_name,
        )
    else:
        node = onnx.helper.make_node(
            "SequenceInsert",
            inputs=["sequence", "tensor"],
            outputs=["output_sequence"],
        )
        inserted = sequence_insert_reference_implementation(sequence, tensor)
        expect(
            node,
            inputs=[sequence, tensor],
            outputs=[inserted],
            name="test_sequence_insert_" + test_name,
        )
```

</details>


### <a name="SequenceLength"></a><a name="sequencelength">**SequenceLength**</a>

  Produces a scalar (tensor of empty shape) containing the number of tensors in 'input_sequence'.

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input_sequence</tt> : S</dt>
<dd>Input sequence.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>length</tt> : I</dt>
<dd>Length of input sequence. It must be a scalar (tensor of empty shape).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain to any tensor type.</dd>
<dt><tt>I</tt> : tensor(int64)</dt>
<dd>Constrain output to integral tensor. It must be a scalar (tensor of empty shape).</dd>
</dl>


### <a name="SequenceMap"></a><a name="sequencemap">**SequenceMap**</a>

  Applies a sub-graph to each sample in the input sequence(s).

  Inputs can be either tensors or sequences, with the exception of the first input, which must
  be a sequence. The length of the first input sequence determines the number of samples in the
  outputs. Any other sequence inputs should have the same number of samples. The number of inputs
  and outputs should match those of the subgraph.

  For each i-th element in the output, a sample will be extracted from the input sequence(s) at
  the i-th position and the sub-graph will be applied to it.
  The outputs will contain the outputs of the sub-graph for each sample, in the same order as in
  the input.

  This operator assumes that processing each sample is independent and could be executed in parallel
  or in any order. Users cannot expect any specific ordering in which each subgraph is computed.
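
The per-sample behaviour can be sketched in Python, with lists standing in for sequences, NumPy arrays for tensors, and a callable returning a tuple in place of the sub-graph. All of these representations, and the name `sequence_map_sketch`, are assumptions made for illustration.

```python
import numpy as np


def sequence_map_sketch(body, input_sequence, *additional_inputs):
    """Apply `body` to each sample and collect one output sequence per body output."""
    n = len(input_sequence)
    for extra in additional_inputs:
        if isinstance(extra, list) and len(extra) != n:
            raise ValueError("all sequence inputs must have the same length")
    out_sequences = []
    for i in range(n):
        # Sequence inputs contribute their i-th sample; tensor inputs are
        # passed unchanged to every invocation of the body.
        args = [input_sequence[i]]
        args += [e[i] if isinstance(e, list) else e for e in additional_inputs]
        results = body(*args)
        if not out_sequences:
            out_sequences = [[] for _ in results]
        for out_seq, result in zip(out_sequences, results):
            out_seq.append(result)
    return out_sequences


x0 = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
x1 = np.array([10.0, 20.0])
(y0,) = sequence_map_sketch(lambda a, b: (a + b,), x0, x1)
print([t.tolist() for t in y0])
# [[11.0, 22.0], [13.0, 24.0], [15.0, 26.0]]
```

The loop here runs in index order for simplicity; since each sample is processed independently, the results would be the same under any execution order.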

#### Version

This version of the operator has been available since version 17 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>body</tt> : graph (required)</dt>
<dd>The graph to be run for each sample in the sequence(s). It should have as many inputs and outputs as inputs and outputs to the SequenceMap function.</dd>
</dl>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>input_sequence</tt> : S</dt>
<dd>Input sequence.</dd>
<dt><tt>additional_inputs</tt> (variadic, heterogeneous) : V</dt>
<dd>Additional inputs to the graph</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>out_sequence</tt> (variadic, heterogeneous) : S</dt>
<dd>Output sequence(s)</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain input types to any sequence type.</dd>
<dt><tt>V</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain to any tensor or sequence type.</dd>
</dl>


#### Examples

<details>
<summary>sequence_map_add_1_sequence_1_tensor</summary>

```python
body = onnx.helper.make_graph(
    [onnx.helper.make_node("Add", ["in0", "in1"], ["out0"])],
    "seq_map_body",
    [
        onnx.helper.make_tensor_value_info(
            "in0", onnx.TensorProto.FLOAT, ["N"]
        ),
        onnx.helper.make_tensor_value_info(
            "in1", onnx.TensorProto.FLOAT, ["N"]
        ),
    ],
    [onnx.helper.make_tensor_value_info("out0", onnx.TensorProto.FLOAT, ["N"])],
)

node = onnx.helper.make_node(
    "SequenceMap", inputs=["x0", "x1"], outputs=["y0"], body=body
)

x0 = [np.random.uniform(0.0, 1.0, 10).astype(np.float32) for k in range(3)]
x1 = np.random.uniform(0.0, 1.0, 10).astype(np.float32)
y0 = [x0[i] + x1 for i in range(3)]
input_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
    onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"]),
]
output_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
]
expect(
    node,
    inputs=[x0, x1],
    outputs=[y0],
    input_type_protos=input_type_protos,
    output_type_protos=output_type_protos,
    name="test_sequence_map_add_1_sequence_1_tensor",
)
```

</details>


<details>
<summary>sequence_map_add_2_sequences</summary>

```python
body = onnx.helper.make_graph(
    [onnx.helper.make_node("Add", ["in0", "in1"], ["out0"])],
    "seq_map_body",
    [
        onnx.helper.make_tensor_value_info(
            "in0", onnx.TensorProto.FLOAT, ["N"]
        ),
        onnx.helper.make_tensor_value_info(
            "in1", onnx.TensorProto.FLOAT, ["N"]
        ),
    ],
    [onnx.helper.make_tensor_value_info("out0", onnx.TensorProto.FLOAT, ["N"])],
)

node = onnx.helper.make_node(
    "SequenceMap", inputs=["x0", "x1"], outputs=["y0"], body=body
)

N = [np.random.randint(1, 10) for _ in range(3)]
x0 = [np.random.uniform(0.0, 1.0, N[k]).astype(np.float32) for k in range(3)]
x1 = [np.random.uniform(0.0, 1.0, N[k]).astype(np.float32) for k in range(3)]
y0 = [x0[k] + x1[k] for k in range(3)]
input_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
]
output_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
]
expect(
    node,
    inputs=[x0, x1],
    outputs=[y0],
    input_type_protos=input_type_protos,
    output_type_protos=output_type_protos,
    name="test_sequence_map_add_2_sequences",
)
```

</details>


<details>
<summary>sequence_map_extract_shapes</summary>

```python
body = onnx.helper.make_graph(
    [onnx.helper.make_node("Shape", ["x"], ["shape"])],
    "seq_map_body",
    [
        onnx.helper.make_tensor_value_info(
            "x", onnx.TensorProto.FLOAT, ["H", "W", "C"]
        )
    ],
    [onnx.helper.make_tensor_value_info("shape", onnx.TensorProto.INT64, [3])],
)

node = onnx.helper.make_node(
    "SequenceMap", inputs=["in_seq"], outputs=["shapes"], body=body
)

shapes = [
    np.array([40, 30, 3], dtype=np.int64),
    np.array([20, 10, 3], dtype=np.int64),
    np.array([10, 5, 3], dtype=np.int64),
]
x0 = [np.zeros(shape, dtype=np.float32) for shape in shapes]
input_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(
            onnx.TensorProto.FLOAT, ["H", "W", "C"]
        )
    ),
]
output_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.INT64, [3])
    ),
]
expect(
    node,
    inputs=[x0],
    outputs=[shapes],
    input_type_protos=input_type_protos,
    output_type_protos=output_type_protos,
    name="test_sequence_map_extract_shapes",
)
```

</details>


<details>
<summary>sequence_map_identity_1_sequence</summary>

```python
body = onnx.helper.make_graph(
    [onnx.helper.make_node("Identity", ["in0"], ["out0"])],
    "seq_map_body",
    [onnx.helper.make_tensor_value_info("in0", onnx.TensorProto.FLOAT, ["N"])],
    [onnx.helper.make_tensor_value_info("out0", onnx.TensorProto.FLOAT, ["M"])],
)

node = onnx.helper.make_node(
    "SequenceMap", inputs=["x"], outputs=["y"], body=body
)

x = [np.random.uniform(0.0, 1.0, 10).astype(np.float32) for _ in range(3)]
y = x
input_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
]
output_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
]
expect(
    node,
    inputs=[x],
    outputs=[y],
    input_type_protos=input_type_protos,
    output_type_protos=output_type_protos,
    name="test_sequence_map_identity_1_sequence",
)
```

</details>


<details>
<summary>sequence_map_identity_1_sequence_1_tensor</summary>

```python
body = onnx.helper.make_graph(
    [
        onnx.helper.make_node("Identity", ["in0"], ["out0"]),
        onnx.helper.make_node("Identity", ["in1"], ["out1"]),
    ],
    "seq_map_body",
    [
        onnx.helper.make_tensor_value_info(
            "in0", onnx.TensorProto.FLOAT, ["N"]
        ),
        onnx.helper.make_tensor_value_info(
            "in1", onnx.TensorProto.FLOAT, ["M"]
        ),
    ],
    [
        onnx.helper.make_tensor_value_info(
            "out0", onnx.TensorProto.FLOAT, ["N"]
        ),
        onnx.helper.make_tensor_value_info(
            "out1", onnx.TensorProto.FLOAT, ["M"]
        ),
    ],
)

node = onnx.helper.make_node(
    "SequenceMap", inputs=["x0", "x1"], outputs=["y0", "y1"], body=body
)

x0 = [
    np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
    for _ in range(3)
]
x1 = np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
y0 = x0
y1 = [x1 for _ in range(3)]
input_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
    onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"]),
]
output_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"])
    ),
]
expect(
    node,
    inputs=[x0, x1],
    outputs=[y0, y1],
    input_type_protos=input_type_protos,
    output_type_protos=output_type_protos,
    name="test_sequence_map_identity_1_sequence_1_tensor",
)
```

</details>


<details>
<summary>sequence_map_identity_2_sequences</summary>

```python
body = onnx.helper.make_graph(
    [
        onnx.helper.make_node("Identity", ["in0"], ["out0"]),
        onnx.helper.make_node("Identity", ["in1"], ["out1"]),
    ],
    "seq_map_body",
    [
        onnx.helper.make_tensor_value_info(
            "in0", onnx.TensorProto.FLOAT, ["N"]
        ),
        onnx.helper.make_tensor_value_info(
            "in1", onnx.TensorProto.FLOAT, ["M"]
        ),
    ],
    [
        onnx.helper.make_tensor_value_info(
            "out0", onnx.TensorProto.FLOAT, ["N"]
        ),
        onnx.helper.make_tensor_value_info(
            "out1", onnx.TensorProto.FLOAT, ["M"]
        ),
    ],
)

node = onnx.helper.make_node(
    "SequenceMap", inputs=["x0", "x1"], outputs=["y0", "y1"], body=body
)

x0 = [
    np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
    for _ in range(3)
]
x1 = [
    np.random.uniform(0.0, 1.0, np.random.randint(1, 10)).astype(np.float32)
    for _ in range(3)
]
y0 = x0
y1 = x1
input_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"])
    ),
]
output_type_protos = [
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["N"])
    ),
    onnx.helper.make_sequence_type_proto(
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.FLOAT, ["M"])
    ),
]
expect(
    node,
    inputs=[x0, x1],
    outputs=[y0, y1],
    input_type_protos=input_type_protos,
    output_type_protos=output_type_protos,
    name="test_sequence_map_identity_2_sequences",
)
```

</details>


### <a name="Shape"></a><a name="shape">**Shape**</a>

  Takes a tensor as input and outputs a 1D int64 tensor containing the shape of the input tensor.
  Optional attributes start and end can be used to compute a slice of the input tensor's shape.
  If start axis is omitted, the slice starts from axis 0.
  The end axis, if specified, is exclusive (and the returned value will not include the size of that axis).
  If the end axis is omitted, the axes up to and including the last one will be included.
  Negative axes indicate counting back from the last axis.
  Note that out-of-range axes are clamped to the range [0, r], where r is the
  rank of the input tensor (after adding r in the case of a
  negative axis). Thus, specifying any end value > r is equivalent to specifying an end
  value of r, and specifying any start value < -r is equivalent to specifying a start
  value of 0.

  Examples:

  ```
  Input tensor with shape: [2, 3, 4]
  No attributes specified.
  Output: [2, 3, 4]
  ```

  ```
  Input tensor with shape: [2, 3, 4]
  start: -1
  Output: [4]
  ```

  ```
  Input tensor with shape: [2, 3, 4]
  end: -1
  Output: [2, 3]
  ```

  ```
  Input tensor with shape: [2, 3, 4]
  start: 1
  end: 2
  Output: [3]
  ```
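
The start/end rules above can be sketched in NumPy. This is a minimal illustration, not the reference implementation: `shape_reference` is a hypothetical name, and it assumes out-of-range values are clamped to `[0, r]`, which is what makes an end value > r behave like an end value of r.

```python
import numpy as np


def shape_reference(x, start=0, end=None):
    """Hypothetical NumPy sketch of Shape with optional start/end."""
    r = x.ndim
    # An omitted end means "up to and including the last axis".
    if end is None:
        end = r
    # Negative axes count back from the last axis.
    if start < 0:
        start += r
    if end < 0:
        end += r
    # Assumption: clamp out-of-range values to [0, r].
    start = min(max(start, 0), r)
    end = min(max(end, 0), r)
    return np.array(x.shape[start:end], dtype=np.int64)


x = np.zeros((2, 3, 4), dtype=np.float32)
print(shape_reference(x))                  # no attributes
print(shape_reference(x, start=-1))        # last axis only
print(shape_reference(x, end=-1))          # all but the last axis
print(shape_reference(x, start=1, end=2))  # middle axis only
```

The four calls mirror the four examples above; the clipping cases (`start=-10`, `end=10`) return the full shape under the same assumption.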

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Shape-1">1</a>, <a href="Changelog.md#Shape-13">13</a>, <a href="Changelog.md#Shape-15">15</a>, <a href="Changelog.md#Shape-19">19</a>

#### Attributes

<dl>
<dt><tt>end</tt> : int</dt>
<dd>(Optional) Ending axis for slicing the shape. Negative value means counting dimensions from the back. If omitted, sizes of all axes up to (and including) the last one will be included.</dd>
<dt><tt>start</tt> : int (default is 0)</dt>
<dd>(Optional) Starting axis for slicing the shape. Default value is 0. Negative value means counting dimensions from the back.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (non-differentiable) : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>shape</tt> (non-differentiable) : T1</dt>
<dd>Shape of the input tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Input tensor can be of arbitrary type.</dd>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrain output to int64 tensor.</dd>
</dl>


#### Examples

<details>
<summary>shape</summary>

```python
x = np.array(
    [
        [1, 2, 3],
        [4, 5, 6],
    ]
).astype(np.float32)
test_shape("_example", x)  # preserve names of original test cases

x = np.random.randn(3, 4, 5).astype(np.float32)

test_shape("", x)  # preserve names of original test cases

test_shape("_start_1", x, start=1)

test_shape("_end_1", x, end=1)

test_shape("_start_negative_1", x, start=-1)

test_shape("_end_negative_1", x, end=-1)

test_shape("_start_1_end_negative_1", x, start=1, end=-1)

test_shape("_start_1_end_2", x, start=1, end=2)

test_shape("_clip_start", x, start=-10)

test_shape("_clip_end", x, end=10)
```

</details>


### <a name="Shrink"></a><a name="shrink">**Shrink**</a>

  Shrink takes one input tensor (Tensor<numeric>) and produces one output tensor
  with the same datatype and shape as the input. It has two attributes, lambd and
  bias. The formula of this operator is: if x < -lambd, y = x + bias;
  if x > lambd, y = x - bias; otherwise, y = 0.
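
As a rough illustration of the piecewise formula, the following NumPy sketch applies it elementwise. `shrink_reference` is a hypothetical helper name, and the sketch only considers floating-point inputs.

```python
import numpy as np


def shrink_reference(x, lambd=0.5, bias=0.0):
    """Hypothetical NumPy sketch of the Shrink formula."""
    # x < -lambd -> x + bias; x > lambd -> x - bias; otherwise 0.
    y = np.where(x < -lambd, x + bias, np.where(x > lambd, x - bias, 0))
    return y.astype(x.dtype)


x = np.arange(-2.0, 2.1, dtype=np.float32)  # [-2, -1, 0, 1, 2]
hard = shrink_reference(x, lambd=1.5)            # bias left at 0
soft = shrink_reference(x, lambd=1.5, bias=1.5)  # bias equal to lambd
print(hard)
print(soft)
```

With `bias` left at 0 the surviving values pass through unchanged (hard shrink); with `bias` equal to `lambd` they are pulled toward zero (soft shrink).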

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>bias</tt> : float (default is 0.0)</dt>
<dd>The bias value added to output. Default is 0.</dd>
<dt><tt>lambd</tt> : float (default is 0.5)</dt>
<dd>The lambd value for the Shrink formulation. Default is 0.5.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>The input data as Tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The output.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input to only numeric types.</dd>
</dl>


#### Examples

<details>
<summary>hard_shrink</summary>

```python
node = onnx.helper.make_node(
    "Shrink",
    inputs=["x"],
    outputs=["y"],
    lambd=1.5,
)
X = np.arange(-2.0, 2.1, dtype=np.float32)
Y = np.array([-2, 0, 0, 0, 2], dtype=np.float32)
expect(node, inputs=[X], outputs=[Y], name="test_shrink_hard")
```

</details>


<details>
<summary>soft_shrink</summary>

```python
node = onnx.helper.make_node(
    "Shrink",
    inputs=["x"],
    outputs=["y"],
    lambd=1.5,
    bias=1.5,
)
X = np.arange(-2.0, 2.1, dtype=np.float32)
Y = np.array([-0.5, 0, 0, 0, 0.5], dtype=np.float32)
expect(node, inputs=[X], outputs=[Y], name="test_shrink_soft")
```

</details>


### <a name="Sigmoid"></a><a name="sigmoid">**Sigmoid**</a>

  Sigmoid takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the sigmoid function, y = 1 / (1 + exp(-x)), is applied to the
  tensor elementwise.
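
Evaluating `1 / (1 + exp(-x))` directly can overflow in `exp` for large negative inputs. The sketch below is one common way to compute the same function without that overflow; it is an illustration only (`sigmoid_stable` is a hypothetical name), and nothing here is prescribed by the operator definition.

```python
import numpy as np


def sigmoid_stable(x):
    """Hypothetical overflow-free evaluation of y = 1 / (1 + exp(-x))."""
    # exp is only ever applied to non-positive values, so it cannot overflow.
    e = np.exp(-np.abs(x))
    # For x >= 0: 1 / (1 + exp(-x)). For x < 0: exp(x) / (1 + exp(x)).
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e)).astype(x.dtype)


x = np.array([-1, 0, 1], dtype=np.float32)
print(sigmoid_stable(x))
```

Both branches are algebraically equal to the formula above, so the results agree with the direct evaluation wherever the latter is finite.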

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Sigmoid-1">1</a>, <a href="Changelog.md#Sigmoid-6">6</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>sigmoid</summary>

```python
node = onnx.helper.make_node(
    "Sigmoid",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = 1.0 / (
    1.0 + np.exp(np.negative(x))
)  # expected output [0.26894143, 0.5, 0.7310586]
expect(node, inputs=[x], outputs=[y], name="test_sigmoid_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = 1.0 / (1.0 + np.exp(np.negative(x)))
expect(node, inputs=[x], outputs=[y], name="test_sigmoid")
```

</details>


### <a name="Sign"></a><a name="sign">**Sign**</a>

  Calculate the sign of the given input tensor element-wise.
  If input > 0, output 1. If input < 0, output -1. If input == 0, output 0.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Sign-9">9</a>

#### Inputs

<dl>
<dt><tt>input</tt> (non-differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (non-differentiable) : T</dt>
<dd>The sign of the input tensor computed element-wise. It has the same shape and type as the input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>sign</summary>

```python
node = onnx.helper.make_node(
    "Sign",
    inputs=["x"],
    outputs=["y"],
)

x = np.array(range(-5, 6)).astype(np.float32)
y = np.sign(x)
expect(node, inputs=[x], outputs=[y], name="test_sign")
```

</details>


### <a name="Sin"></a><a name="sin">**Sin**</a>

  Calculates the sine of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The sine of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>sin</summary>

```python
node = onnx.helper.make_node(
    "Sin",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.sin(x)
expect(node, inputs=[x], outputs=[y], name="test_sin_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.sin(x)
expect(node, inputs=[x], outputs=[y], name="test_sin")
```

</details>


### <a name="Sinh"></a><a name="sinh">**Sinh**</a>

  Calculates the hyperbolic sine of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The hyperbolic sine values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>sinh</summary>

```python
node = onnx.helper.make_node(
    "Sinh",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.sinh(x)  # expected output [-1.17520118,  0.,  1.17520118]
expect(node, inputs=[x], outputs=[y], name="test_sinh_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.sinh(x)
expect(node, inputs=[x], outputs=[y], name="test_sinh")
```

</details>


### <a name="Size"></a><a name="size">**Size**</a>

  Takes a tensor as input and outputs an int64 scalar equal to the total number of elements of the input tensor.

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Size-1">1</a>, <a href="Changelog.md#Size-13">13</a>, <a href="Changelog.md#Size-19">19</a>

#### Inputs

<dl>
<dt><tt>data</tt> (non-differentiable) : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>size</tt> (non-differentiable) : T1</dt>
<dd>Total number of elements of the input tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Input tensor can be of arbitrary type.</dd>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrain output to an int64 tensor, which should be a scalar.</dd>
</dl>


#### Examples

<details>
<summary>size</summary>

```python
node = onnx.helper.make_node(
    "Size",
    inputs=["x"],
    outputs=["y"],
)

x = np.array(
    [
        [1, 2, 3],
        [4, 5, 6],
    ]
).astype(np.float32)
y = np.array(6).astype(np.int64)

expect(node, inputs=[x], outputs=[y], name="test_size_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.array(x.size).astype(np.int64)

expect(node, inputs=[x], outputs=[y], name="test_size")
```

</details>


### <a name="Slice"></a><a name="slice">**Slice**</a>

  Produces a slice of the input tensor along multiple axes. Similar to numpy:
  https://numpy.org/doc/stable/user/basics.indexing.html?highlight=slice#slicing-and-striding

  Slice uses the `starts`, `ends`, `axes` and `steps` inputs to select a sub-tensor
  of its input `data` tensor.

  An effective `starts[i]`, `ends[i]`, and `steps[i]` must be computed for each `i`
  in `[0, ... r-1]` where `r = rank(input)` as follows:

  If `axes` are omitted, they are set to `[0, ..., r-1]`.
  If `steps` are omitted, they are set to `[1, ..., 1]` of length `len(starts)`.

  The effective values are initialized as `start[i] = 0`, `ends[i] = dims[i]` where
  `dims` are the dimensions of `input` and `steps[i] = 1`.

  All negative elements of `axes` are made non-negative by adding `r` to them, where
  `r = rank(input)`.

  All negative values in `starts[i]` and `ends[i]` have `dims[axes[i]]` added to them,
  where `dims` are the dimensions of `input`. Then `start[axes[i]]` is the adjusted
  `starts[i]` clamped into the range `[0, dims[axes[i]]]` for positive stepping
  and `[0, dims[axes[i]]-1]` for negative stepping.

  The clamping for the adjusted `ends[i]` depends on the sign of `steps[i]` and must
  accommodate copying 0 through `dims[axes[i]]` elements, so for positive stepping
  `ends[axes[i]]` is clamped to `[0, dims[axes[i]]]`, while for negative stepping it
  is clamped to `[-1, dims[axes[i]]-1]`.

  Finally, `steps[axes[i]] = steps[i]`.

  For slicing to the end of a dimension with unknown size, it is recommended to pass
  in `INT_MAX` when slicing forward and `INT_MIN` when slicing backward.

  Example 1:

  ```
  data = [
      [1, 2, 3, 4],
      [5, 6, 7, 8],
  ]
  axes = [0, 1]
  starts = [1, 0]
  ends = [2, 3]
  steps = [1, 2]
  result = [
      [5, 7],
  ]
  ```

  Example 2:

  ```
  data = [
      [1, 2, 3, 4],
      [5, 6, 7, 8],
  ]
  starts = [0, 1]
  ends = [-1, 1000]
  result = [
      [2, 3, 4],
  ]
  ```
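
The adjustment and clamping rules above can be sketched as follows. This is a minimal NumPy illustration under the rules as stated, not the reference implementation; `slice_reference` is a hypothetical name.

```python
import numpy as np


def slice_reference(data, starts, ends, axes=None, steps=None):
    """Hypothetical NumPy sketch of the Slice start/end/step rules."""
    r = data.ndim
    dims = data.shape
    if axes is None:
        axes = range(r)  # zip below stops at len(starts)
    if steps is None:
        steps = [1] * len(starts)
    # Effective values default to the full extent of every axis.
    index = [slice(None)] * r
    for start, end, axis, step in zip(starts, ends, axes, steps):
        start, end, axis, step = int(start), int(end), int(axis), int(step)
        if step == 0:
            raise ValueError("'steps' cannot be 0")
        if axis < 0:
            axis += r
        d = dims[axis]
        # Negative starts/ends count from the end of the axis.
        if start < 0:
            start += d
        if end < 0:
            end += d
        if step > 0:
            start = min(max(start, 0), d)
            end = min(max(end, 0), d)
        else:
            start = min(max(start, 0), d - 1)
            end = min(max(end, -1), d - 1)
        # An effective end of -1 means "run past element 0"; Python spells that None.
        index[axis] = slice(start, None if end == -1 else end, step)
    return data[tuple(index)]


data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
ex1 = slice_reference(data, starts=[1, 0], ends=[2, 3], axes=[0, 1], steps=[1, 2])
ex2 = slice_reference(data, starts=[0, 1], ends=[-1, 1000])
print(ex1)
print(ex2)
```

`ex1` and `ex2` correspond to Example 1 and Example 2 above. Passing `INT_MAX` as a start and `INT_MIN` as an end with a step of -1 reverses the whole axis, because the end is clamped to -1.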

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Slice-1">1</a>, <a href="Changelog.md#Slice-10">10</a>, <a href="Changelog.md#Slice-11">11</a>

#### Inputs (3 - 5)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Tensor of data to extract slices from.</dd>
<dt><tt>starts</tt> (non-differentiable) : Tind</dt>
<dd>1-D tensor of starting indices of corresponding axis in `axes`</dd>
<dt><tt>ends</tt> (non-differentiable) : Tind</dt>
<dd>1-D tensor of ending indices (exclusive) of corresponding axis in `axes`</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : Tind</dt>
<dd>1-D tensor of axes that `starts` and `ends` apply to. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(data). Behavior is undefined if an axis is repeated.</dd>
<dt><tt>steps</tt> (optional, non-differentiable) : Tind</dt>
<dd>1-D tensor of slice step of corresponding axis in `axes`. Negative value means slicing backward. 'steps' cannot be 0. Defaults to 1s.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Sliced data tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain indices to integer types</dd>
</dl>


#### Examples

<details>
<summary>slice</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends", "axes", "steps"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
y = x[0:3, 0:10]
starts = np.array([0, 0], dtype=np.int64)
ends = np.array([3, 10], dtype=np.int64)
axes = np.array([0, 1], dtype=np.int64)
steps = np.array([1, 1], dtype=np.int64)

expect(
    node, inputs=[x, starts, ends, axes, steps], outputs=[y], name="test_slice"
)
```

</details>


<details>
<summary>slice_default_axes</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0, 0, 3], dtype=np.int64)
ends = np.array([20, 10, 4], dtype=np.int64)
y = x[:, :, 3:4]

expect(
    node, inputs=[x, starts, ends], outputs=[y], name="test_slice_default_axes"
)
```

</details>


<details>
<summary>slice_default_steps</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends", "axes"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0, 0, 3], dtype=np.int64)
ends = np.array([20, 10, 4], dtype=np.int64)
axes = np.array([0, 1, 2], dtype=np.int64)
y = x[:, :, 3:4]

expect(
    node,
    inputs=[x, starts, ends, axes],
    outputs=[y],
    name="test_slice_default_steps",
)
```

</details>


<details>
<summary>slice_end_out_of_bounds</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends", "axes", "steps"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([1], dtype=np.int64)
ends = np.array([1000], dtype=np.int64)
axes = np.array([1], dtype=np.int64)
steps = np.array([1], dtype=np.int64)
y = x[:, 1:1000]

expect(
    node,
    inputs=[x, starts, ends, axes, steps],
    outputs=[y],
    name="test_slice_end_out_of_bounds",
)
```

</details>


<details>
<summary>slice_neg</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends", "axes", "steps"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0], dtype=np.int64)
ends = np.array([-1], dtype=np.int64)
axes = np.array([1], dtype=np.int64)
steps = np.array([1], dtype=np.int64)
y = x[:, 0:-1]

expect(
    node,
    inputs=[x, starts, ends, axes, steps],
    outputs=[y],
    name="test_slice_neg",
)
```

</details>


<details>
<summary>slice_neg_steps</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends", "axes", "steps"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([20, 10, 4], dtype=np.int64)
ends = np.array([0, 0, 1], dtype=np.int64)
axes = np.array([0, 1, 2], dtype=np.int64)
steps = np.array([-1, -3, -2]).astype(np.int64)
y = x[20:0:-1, 10:0:-3, 4:1:-2]

expect(
    node,
    inputs=[x, starts, ends, axes, steps],
    outputs=[y],
    name="test_slice_neg_steps",
)
```

</details>


<details>
<summary>slice_negative_axes</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends", "axes"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([0, 0, 3], dtype=np.int64)
ends = np.array([20, 10, 4], dtype=np.int64)
axes = np.array([0, -2, -1], dtype=np.int64)
y = x[:, :, 3:4]

expect(
    node,
    inputs=[x, starts, ends, axes],
    outputs=[y],
    name="test_slice_negative_axes",
)
```

</details>


<details>
<summary>slice_start_out_of_bounds</summary>

```python
node = onnx.helper.make_node(
    "Slice",
    inputs=["x", "starts", "ends", "axes", "steps"],
    outputs=["y"],
)

x = np.random.randn(20, 10, 5).astype(np.float32)
starts = np.array([1000], dtype=np.int64)
ends = np.array([1000], dtype=np.int64)
axes = np.array([1], dtype=np.int64)
steps = np.array([1], dtype=np.int64)
y = x[:, 1000:1000]

expect(
    node,
    inputs=[x, starts, ends, axes, steps],
    outputs=[y],
    name="test_slice_start_out_of_bounds",
)
```

</details>


### <a name="Softmax"></a><a name="softmax">**Softmax**</a>

  The operator computes the normalized exponential values for the given input:

   Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)

  The "axis" attribute indicates the dimension along which Softmax
  will be performed. The output tensor has the same shape
  and contains the Softmax values of the corresponding input.
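
The formula can be sketched in NumPy as below. The function is an illustrative assumption, not necessarily the helper used by the test cases. Subtracting the per-axis maximum before exponentiating does not change the result mathematically, but it keeps `exp` finite for large inputs.

```python
import numpy as np


def softmax(x, axis=-1):
    """Hypothetical NumPy sketch of Softmax along one axis."""
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    x_max = np.max(x, axis=axis, keepdims=True)
    e = np.exp(x - x_max)
    # Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)
    return e / np.sum(e, axis=axis, keepdims=True)


x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]], dtype=np.float32)
y = softmax(x)
print(y)
```

Both rows produce the same output because they differ only by a constant offset along the softmax axis.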

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Softmax-1">1</a>, <a href="Changelog.md#Softmax-11">11</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>
Describes the dimension Softmax will be performed on.
Negative value means counting dimensions
from the back. Accepted range is [-r, r-1] where r = rank(input).
</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>The input tensor of rank >= axis.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The output values with the same shape as the input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>softmax</summary>

```python
node = onnx.helper.make_node(
    "Softmax",
    inputs=["x"],
    outputs=["y"],
)
x = np.array([[-1, 0, 1]]).astype(np.float32)
# expected output [[0.09003058, 0.24472848, 0.66524094]]
y = softmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_softmax_example")
```

</details>


<details>
<summary>softmax_axis</summary>

```python
x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]]).astype(np.float32)
# expected output
# [[0.032058604 0.08714432  0.23688284  0.6439143  ]
# [0.032058604 0.08714432  0.23688284  0.6439143  ]]
y = softmax(x)

node = onnx.helper.make_node(
    "Softmax",
    inputs=["x"],
    outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_softmax_large_number")

x = np.abs(np.random.randn(3, 4, 5).astype(np.float32))
node = onnx.helper.make_node(
    "Softmax",
    inputs=["x"],
    outputs=["y"],
    axis=0,
)
y = softmax(x, axis=0)
expect(node, inputs=[x], outputs=[y], name="test_softmax_axis_0")

node = onnx.helper.make_node(
    "Softmax",
    inputs=["x"],
    outputs=["y"],
    axis=1,
)
y = softmax(x, axis=1)
expect(node, inputs=[x], outputs=[y], name="test_softmax_axis_1")

node = onnx.helper.make_node(
    "Softmax",
    inputs=["x"],
    outputs=["y"],
    axis=2,
)
y = softmax(x, axis=2)
expect(node, inputs=[x], outputs=[y], name="test_softmax_axis_2")

node = onnx.helper.make_node(
    "Softmax",
    inputs=["x"],
    outputs=["y"],
    axis=-1,
)
y = softmax(x, axis=-1)
expect(node, inputs=[x], outputs=[y], name="test_softmax_negative_axis")

# default axis is -1
node = onnx.helper.make_node(
    "Softmax",
    inputs=["x"],
    outputs=["y"],
)
expect(node, inputs=[x], outputs=[y], name="test_softmax_default_axis")
```

</details>


### <a name="SoftmaxCrossEntropyLoss"></a><a name="softmaxcrossentropyloss">**SoftmaxCrossEntropyLoss**</a>

  Loss function that measures the softmax cross entropy
  between 'scores' and 'labels'.
  This operator first computes a loss tensor whose shape is identical to the labels input.
  If the input is 2-D with shape (N, C), the loss tensor may be an N-element vector L = (l_1, l_2, ..., l_N).
  If the input is N-D tensor with shape (N, C, D1, D2, ..., Dk),
  the loss tensor L may have (N, D1, D2, ..., Dk) as its shape and L[i][j_1][j_2]...[j_k] denotes a scalar element in L.
  After L is available, this operator can optionally apply a reduction.

  * shape(scores): (N, C) where C is the number of classes, or (N, C, D1, D2,..., Dk),
    with K >= 1 in case of K-dimensional loss.
  * shape(labels): (N) where each value is 0 <= labels[i] <= C-1, or (N, D1, D2,..., Dk),
    with K >= 1 in case of K-dimensional loss.

  The loss for one sample, l_i, can be calculated as follows:
  ```
  l[i][d1][d2]...[dk] = -y[i][c][d1][d2]...[dk], where c is the index of the class.
  ```
  or
  ```
  l[i][d1][d2]...[dk] = -y[i][c][d1][d2]..[dk] * weights[c], if 'weights' is provided.
  ```

  The loss is zero when the label value equals ignore_index:
  ```
  l[i][d1][d2]...[dk] = 0, when labels[i][d1][d2]...[dk] = ignore_index
  ```

  where:
  ```
  p = Softmax(scores)
  y = Log(p)
  c = labels[i][d1][d2]...[dk]
  ```

  Finally, L is optionally reduced:

  * If reduction = 'none', the output is L with shape (N, D1, D2, ..., Dk).
  * If reduction = 'sum', the output is scalar: Sum(L).
  * If reduction = 'mean', the output is scalar: ReduceMean(L), or if weight is provided: `ReduceSum(L) / ReduceSum(W)`,
    where tensor W is of shape `(N, D1, D2, ..., Dk)` and `W[i][d1][d2]...[dk] = weights[labels[i][d1][d2]...[dk]]`.
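
The steps above can be sketched in NumPy. This is a minimal illustration rather than the reference implementation: `sce_reference` is a hypothetical name, and the sketch assumes that ignored positions also get a weight of 0, so they drop out of the denominator of the `mean` reduction.

```python
import numpy as np


def sce_reference(scores, labels, weights=None, reduction="mean", ignore_index=None):
    """Hypothetical NumPy sketch of SoftmaxCrossEntropyLoss."""
    num_classes = scores.shape[1]
    # y = Log(Softmax(scores)) along the class axis (axis 1), computed stably.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    if weights is None:
        weights = np.ones(num_classes, dtype=scores.dtype)
    if ignore_index is None:
        ignored = np.zeros(labels.shape, dtype=bool)
    else:
        ignored = labels == ignore_index
    # Swap ignored labels for class 0 so the gather stays in range.
    safe = np.where(ignored, 0, labels)
    # picked[i][d1]...[dk] = y[i][c][d1]...[dk] with c = labels[i][d1]...[dk]
    picked = np.take_along_axis(log_prob, np.expand_dims(safe, 1), axis=1).squeeze(1)
    # Assumption: ignored positions carry zero weight.
    w = np.where(ignored, 0, weights[safe]).astype(scores.dtype)
    loss = -picked * w
    if reduction == "none":
        return loss
    if reduction == "sum":
        return loss.sum()
    return loss.sum() / w.sum()  # "mean"


scores = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]], dtype=np.float32)
labels = np.array([2, 0], dtype=np.int64)
print(sce_reference(scores, labels, reduction="none"))
print(sce_reference(scores, labels))
```

Without weights and without `ignore_index`, `w` is all ones, so the `mean` branch reduces to `ReduceMean(L)`.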

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#SoftmaxCrossEntropyLoss-12">12</a>

#### Attributes

<dl>
<dt><tt>ignore_index</tt> : int</dt>
<dd>Specifies a target value that is ignored and does not contribute to the input gradient. It's an optional value.</dd>
<dt><tt>reduction</tt> : string (default is mean)</dt>
<dd>Type of reduction to apply to loss: none, sum, mean (default). 'none': no reduction will be applied. 'sum': the output will be summed. 'mean': the sum of the output will be divided by the number of elements in the output.</dd>
</dl>

#### Inputs (2 - 3)

<dl>
<dt><tt>scores</tt> (differentiable) : T</dt>
<dd>The predicted outputs with shape [batch_size, class_size], or [batch_size, class_size, D1, D2, ..., Dk], where K is the number of dimensions.</dd>
<dt><tt>labels</tt> (non-differentiable) : Tind</dt>
<dd>The ground truth output tensor, with shape [batch_size], or [batch_size, D1, D2, ..., Dk], where K is the number of dimensions. Labels element value shall be in range of [0, C). If ignore_index is specified, it may have a value outside [0, C) and the label values should either be in the range [0, C) or have the value ignore_index.</dd>
<dt><tt>weights</tt> (optional, non-differentiable) : T</dt>
<dd>A manual rescaling weight given to each class. If given, it has to be a 1D Tensor assigning weight to each of the classes. Otherwise, it is treated as if having all ones.</dd>
</dl>

#### Outputs (1 - 2)

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Weighted loss float Tensor. If reduction is 'none', this has the shape of [batch_size], or [batch_size, D1, D2, ..., Dk] in case of K-dimensional loss. Otherwise, it is a scalar.</dd>
<dt><tt>log_prob</tt> (optional, differentiable) : T</dt>
<dd>Log probability tensor. If the output of softmax is prob, its value is log(prob).</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
<dt><tt>Tind</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain target to integer types.</dd>
</dl>


#### Examples

<details>
<summary>input_shape_is_NCd1_mean_weight_negative_ii</summary>

```python
reduction = "mean"
ignore_index = np.int64(-1)

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1 = 3, 5, 6
np.random.seed(0)
x = np.random.rand(N, C, dim1).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1)).astype(np.int64)
labels[0][0] = -1
weight = np.random.rand(C).astype(np.float32)

sce = softmaxcrossentropy(
    x, labels, weight=weight, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[x, labels, weight],
    outputs=[sce],
    name="test_sce_NCd1_mean_weight_negative_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1_mean_weight_negative_ii_log_prob</summary>

```python
reduction = "mean"
ignore_index = np.int64(-1)

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1 = 3, 5, 6
np.random.seed(0)
x = np.random.rand(N, C, dim1).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1)).astype(np.int64)
labels[0][0] = -1
weight = np.random.rand(C).astype(np.float32)

loss, log_prob = softmaxcrossentropy(
    x,
    labels,
    weight=weight,
    reduction=reduction,
    ignore_index=ignore_index,
    get_log_prob=True,
)

expect(
    node,
    inputs=[x, labels, weight],
    outputs=[loss, log_prob],
    name="test_sce_NCd1_mean_weight_negative_ii_log_prob",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3_none_no_weight_negative_ii</summary>

```python
reduction = "none"
ignore_index = np.int64(-5)

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1, dim2, dim3 = 3, 5, 6, 6, 5
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1, dim2, dim3)).astype(
    np.int64
)
labels[0][0][0][0] = -5

sce = softmaxcrossentropy(
    x, labels, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[x, labels],
    outputs=[sce],
    name="test_sce_NCd1d2d3_none_no_weight_negative_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3_none_no_weight_negative_ii_log_prob</summary>

```python
reduction = "none"
ignore_index = np.int64(-5)

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C, dim1, dim2, dim3 = 3, 5, 6, 6, 5
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N, dim1, dim2, dim3)).astype(
    np.int64
)
labels[0][0][0][0] = -5

loss, log_prob = softmaxcrossentropy(
    x, labels, reduction=reduction, ignore_index=ignore_index, get_log_prob=True
)

expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_NCd1d2d3_none_no_weight_negative_ii_log_prob",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3_sum_weight_high_ii</summary>

```python
reduction = "sum"
ignore_index = np.int64(10)

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C = 3, 5
np.random.seed(0)
x = np.random.rand(N, C).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N)).astype(np.int64)
labels[0] = 10
weight = np.random.rand(C).astype(np.float32)

sce = softmaxcrossentropy(
    x, labels, weight=weight, reduction=reduction, ignore_index=ignore_index
)

expect(
    node,
    inputs=[x, labels, weight],
    outputs=[sce],
    name="test_sce_NCd1d2d3_sum_weight_high_ii",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3_sum_weight_high_ii_log_prob</summary>

```python
reduction = "sum"
ignore_index = np.int64(10)

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

N, C = 3, 5
np.random.seed(0)
x = np.random.rand(N, C).astype(np.float32)
labels = np.random.randint(0, high=C, size=(N)).astype(np.int64)
labels[0] = 10
weight = np.random.rand(C).astype(np.float32)

loss, log_prob = softmaxcrossentropy(
    x,
    labels,
    weight=weight,
    reduction=reduction,
    ignore_index=ignore_index,
    get_log_prob=True,
)

expect(
    node,
    inputs=[x, labels, weight],
    outputs=[loss, log_prob],
    name="test_sce_NCd1d2d3_sum_weight_high_ii_log_prob",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3d4d5_mean_weight</summary>

```python
reduction = "mean"

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
)

N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
    0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)

sce = softmaxcrossentropy(x, labels, weight=weight, reduction=reduction)

expect(
    node,
    inputs=[x, labels, weight],
    outputs=[sce],
    name="test_sce_NCd1d2d3d4d5_mean_weight",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3d4d5_mean_weight_log_prob</summary>

```python
reduction = "mean"

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
    0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)
weight = np.random.rand(C).astype(np.float32)

loss, log_prob = softmaxcrossentropy(
    x, labels, weight=weight, reduction=reduction, get_log_prob=True
)

expect(
    node,
    inputs=[x, labels, weight],
    outputs=[loss, log_prob],
    name="test_sce_NCd1d2d3d4d5_mean_weight_log_prob",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3d4d5_none_no_weight</summary>

```python
reduction = "none"

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
)

N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
    0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)

sce = softmaxcrossentropy(x, labels, reduction=reduction)

expect(
    node,
    inputs=[x, labels],
    outputs=[sce],
    name="test_sce_NCd1d2d3d4d5_none_no_weight",
)
```

</details>


<details>
<summary>input_shape_is_NCd1d2d3d4d5_none_no_weight_log_prob</summary>

```python
reduction = "none"

node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

N, C, dim1, dim2, dim3, dim4, dim5 = 3, 5, 6, 6, 5, 3, 4
np.random.seed(0)
x = np.random.rand(N, C, dim1, dim2, dim3, dim4, dim5).astype(np.float32)
labels = np.random.randint(
    0, high=C, size=(N, dim1, dim2, dim3, dim4, dim5)
).astype(np.int64)

loss, log_prob = softmaxcrossentropy(
    x, labels, reduction=reduction, get_log_prob=True
)

expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_NCd1d2d3d4d5_none_no_weight_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean</summary>

```python
# Define operator attributes.
reduction = "mean"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels)

# Check results
expect(node, inputs=[x, labels], outputs=[sce], name="test_sce_mean")
```

</details>


<details>
<summary>softmaxcrossentropy_mean_3d</summary>

```python
# Define operator attributes.
reduction = "mean"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
y = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, y)

# Check results
expect(node, inputs=[x, y], outputs=[sce], name="test_sce_mean_3d")
```

</details>


<details>
<summary>softmaxcrossentropy_mean_3d_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
y = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(x, y, get_log_prob=True)

# Check results
expect(
    node,
    inputs=[x, y],
    outputs=[loss, log_prob],
    name="test_sce_mean_3d_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(x, labels, get_log_prob=True)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_mean_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_no_weights_ii</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(2)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, ignore_index=ignore_index)

# Check results
expect(
    node, inputs=[x, labels], outputs=[sce], name="test_sce_mean_no_weight_ii"
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_no_weights_ii_3d</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(2)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, ignore_index=ignore_index)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[sce],
    name="test_sce_mean_no_weight_ii_3d",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_no_weights_ii_3d_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(2)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, ignore_index=ignore_index, get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_mean_no_weight_ii_3d_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_no_weights_ii_4d</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(
    x, labels, reduction=reduction, ignore_index=ignore_index
)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[sce],
    name="test_sce_mean_no_weight_ii_4d",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_no_weights_ii_4d_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, reduction=reduction, ignore_index=ignore_index, get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_mean_no_weight_ii_4d_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_no_weights_ii_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(2)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, ignore_index=ignore_index, get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_mean_no_weight_ii_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights</summary>

```python
# Define operator attributes.
reduction = "mean"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[sce],
    name="test_sce_mean_weight",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights_ii</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(0)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(0)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights, ignore_index=ignore_index)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[sce],
    name="test_sce_mean_weight_ii",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights_ii_3d</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(1)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(1)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights, ignore_index=ignore_index)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[sce],
    name="test_sce_mean_weight_ii_3d",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights_ii_3d_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(1)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2)).astype(np.int64)
labels[0][0] = np.int64(1)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, weight=weights, ignore_index=ignore_index, get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[loss, log_prob],
    name="test_sce_mean_weight_ii_3d_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights_ii_4d</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(
    x, labels, reduction=reduction, weight=weights, ignore_index=ignore_index
)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[sce],
    name="test_sce_mean_weight_ii_4d",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights_ii_4d_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(2)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5, 2, 7).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3, 2, 7)).astype(np.int64)
labels[0][0][0] = np.int64(2)
weights = np.array([0.2, 0.3, 0.6, 0.1, 0.5], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x,
    labels,
    reduction=reduction,
    weight=weights,
    ignore_index=ignore_index,
    get_log_prob=True,
)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[loss, log_prob],
    name="test_sce_mean_weight_ii_4d_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights_ii_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"
ignore_index = np.int64(0)

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
    ignore_index=ignore_index,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
labels[0] = np.int64(0)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, weight=weights, ignore_index=ignore_index, get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[loss, log_prob],
    name="test_sce_mean_weight_ii_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_mean_weights_log_prob</summary>

```python
# Define operator attributes.
reduction = "mean"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, weight=weights, get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[loss, log_prob],
    name="test_sce_mean_weight_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_none</summary>

```python
# Define operator attributes.
reduction = "none"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, reduction="none")

# Check results
expect(node, inputs=[x, labels], outputs=[sce], name="test_sce_none")
```

</details>


<details>
<summary>softmaxcrossentropy_none_log_prob</summary>

```python
# Define operator attributes.
reduction = "none"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, reduction="none", get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_none_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_none_weights</summary>

```python
# Define operator attributes.
reduction = "none"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, weight=weights, reduction="none")

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[sce],
    name="test_sce_none_weights",
)
```

</details>


<details>
<summary>softmaxcrossentropy_none_weights_log_prob</summary>

```python
# Define operator attributes.
reduction = "none"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y", "w"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)
weights = np.array([0.9, 0.7, 0.8, 0.9, 0.9], dtype=np.float32)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, weight=weights, reduction="none", get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels, weights],
    outputs=[loss, log_prob],
    name="test_sce_none_weights_log_prob",
)
```

</details>


<details>
<summary>softmaxcrossentropy_sum</summary>

```python
# Define operator attributes.
reduction = "sum"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
sce = softmaxcrossentropy(x, labels, reduction="sum")

# Check results
expect(node, inputs=[x, labels], outputs=[sce], name="test_sce_sum")
```

</details>


<details>
<summary>softmaxcrossentropy_sum_log_prob</summary>

```python
# Define operator attributes.
reduction = "sum"

# Create operator.
node = onnx.helper.make_node(
    "SoftmaxCrossEntropyLoss",
    inputs=["x", "y"],
    outputs=["z", "log_prob"],
    reduction=reduction,
)

# Define operator inputs.
np.random.seed(0)
x = np.random.rand(3, 5).astype(np.float32)
labels = np.random.randint(0, high=5, size=(3,)).astype(np.int64)

# Compute SoftmaxCrossEntropyLoss
loss, log_prob = softmaxcrossentropy(
    x, labels, reduction="sum", get_log_prob=True
)

# Check results
expect(
    node,
    inputs=[x, labels],
    outputs=[loss, log_prob],
    name="test_sce_sum_log_prob",
)
```

</details>


### <a name="Softplus"></a><a name="softplus">**Softplus**</a>

  Softplus takes one input data (`Tensor<T>`) and produces one output data
  (`Tensor<T>`) where the softplus function, y = ln(exp(x) + 1), is applied to
  the tensor elementwise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>softplus</summary>

```python
node = onnx.helper.make_node(
    "Softplus",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.log(
    np.exp(x) + 1
)  # expected output [0.31326166, 0.69314718, 1.31326163]
expect(node, inputs=[x], outputs=[y], name="test_softplus_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.log(np.exp(x) + 1)
expect(node, inputs=[x], outputs=[y], name="test_softplus")
```

</details>
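
The reference computation `np.log(np.exp(x) + 1)` used in the example above overflows inside `np.exp` for large positive inputs (roughly `x > 88` in float32). A minimal sketch of an equivalent evaluation that avoids the overflow, assuming NumPy's `np.logaddexp` is acceptable; the helper name `softplus_stable` is hypothetical and not part of the ONNX test suite:

```python
import numpy as np


def softplus_stable(x):
    """Hypothetical helper: ln(exp(x) + 1) without overflowing exp(x)."""
    # logaddexp(a, b) = ln(exp(a) + exp(b)); with b = 0 this is ln(exp(x) + 1).
    return np.logaddexp(x, 0)


x = np.array([-1, 0, 1]).astype(np.float32)
y = softplus_stable(x)  # agrees with np.log(np.exp(x) + 1) on this input

big = np.array([1000.0], dtype=np.float32)
y_big = softplus_stable(big)  # stays finite, unlike the direct formula
```

For moderate inputs the two formulations agree to floating-point precision, so the sketch is only relevant when inputs may be large.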


### <a name="Softsign"></a><a name="softsign">**Softsign**</a>

  Calculates the softsign (x/(1+|x|)) of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 1 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The softsign (x/(1+|x|)) values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>softsign</summary>

```python
node = onnx.helper.make_node(
    "Softsign",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.array([-0.5, 0, 0.5]).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_softsign_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = x / (1 + np.abs(x))
expect(node, inputs=[x], outputs=[y], name="test_softsign")
```

</details>


### <a name="SpaceToDepth"></a><a name="spacetodepth">**SpaceToDepth**</a>

  SpaceToDepth rearranges blocks of spatial data into depth. More specifically,
  this op outputs a copy of the input tensor where values from the height and width dimensions
  are moved to the depth dimension.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#SpaceToDepth-1">1</a>

#### Attributes

<dl>
<dt><tt>blocksize</tt> : int (required)</dt>
<dd>Blocks of [blocksize, blocksize] are moved.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor of [N,C,H,W], where N is the batch axis, C is the channel or depth, H is the height and W is the width.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Output tensor of [N, C * blocksize * blocksize, H/blocksize, W/blocksize].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>example</summary>

```python
node = onnx.helper.make_node(
    "SpaceToDepth",
    inputs=["x"],
    outputs=["y"],
    blocksize=2,
)

# (1, 1, 4, 6) input tensor
x = np.array(
    [
        [
            [
                [0, 6, 1, 7, 2, 8],
                [12, 18, 13, 19, 14, 20],
                [3, 9, 4, 10, 5, 11],
                [15, 21, 16, 22, 17, 23],
            ]
        ]
    ]
).astype(np.float32)

# (1, 4, 2, 3) output tensor
y = np.array(
    [
        [
            [[0, 1, 2], [3, 4, 5]],
            [[6, 7, 8], [9, 10, 11]],
            [[12, 13, 14], [15, 16, 17]],
            [[18, 19, 20], [21, 22, 23]],
        ]
    ]
).astype(np.float32)
expect(node, inputs=[x], outputs=[y], name="test_spacetodepth_example")
```

</details>


<details>
<summary>spacetodepth</summary>

```python
b, c, h, w = shape = (2, 2, 6, 6)
blocksize = 2
node = onnx.helper.make_node(
    "SpaceToDepth",
    inputs=["x"],
    outputs=["y"],
    blocksize=blocksize,
)
x = np.random.random_sample(shape).astype(np.float32)
tmp = np.reshape(
    x, [b, c, h // blocksize, blocksize, w // blocksize, blocksize]
)
tmp = np.transpose(tmp, [0, 3, 5, 1, 2, 4])
y = np.reshape(tmp, [b, c * (blocksize**2), h // blocksize, w // blocksize])
expect(node, inputs=[x], outputs=[y], name="test_spacetodepth")
```

</details>


### <a name="Split"></a><a name="split">**Split**</a>

  Split a tensor into a list of tensors, along the specified 'axis'.
  Either input 'split' or the attribute 'num_outputs' should be specified, but not both.
  If the attribute 'num_outputs' is specified, then the tensor is split into equal sized parts.
  If the tensor is not evenly splittable into `num_outputs`, the last chunk will be smaller.
  If the input 'split' is specified, it indicates the sizes of each output in the split.
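
How the output sizes follow from these rules can be sketched in plain Python. The helpers `split_sizes` and `split_1d` are hypothetical names used for illustration only. The sketch assumes that "equal sized parts" means chunks of `ceil(dim / num_outputs)` with the remainder in the last chunk, which agrees with the uneven-split examples for this operator but is not a statement about any particular runtime.

```python
def split_sizes(dim, num_outputs=None, split=None):
    """Return the size of each output along the split axis (hypothetical helper)."""
    if (num_outputs is None) == (split is None):
        raise ValueError("specify exactly one of 'num_outputs' or 'split'")
    if split is not None:
        if any(s < 0 for s in split) or sum(split) != dim:
            raise ValueError("'split' values must be >= 0 and sum to the axis size")
        return list(split)
    if num_outputs < 1:
        raise ValueError("'num_outputs' must be at least 1")
    chunk = -(-dim // num_outputs)  # ceiling division without floats
    last = dim - chunk * (num_outputs - 1)
    if last < 0:
        # Assumption: this sketch rejects cases where the last chunk would be negative.
        raise ValueError("axis too small for the requested number of outputs")
    return [chunk] * (num_outputs - 1) + [last]


def split_1d(values, num_outputs=None, split=None):
    """Split a flat list according to split_sizes."""
    outputs, start = [], 0
    for size in split_sizes(len(values), num_outputs, split):
        outputs.append(values[start:start + size])
        start += size
    return outputs


even = split_1d([1, 2, 3, 4, 5, 6], num_outputs=3)
uneven = split_1d([1, 2, 3, 4, 5, 6, 7], num_outputs=4)
explicit = split_1d([1, 2, 3, 4, 5, 6], split=[2, 4])
```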

#### Version

This version of the operator has been available since version 18 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Split-1">1</a>, <a href="Changelog.md#Split-2">2</a>, <a href="Changelog.md#Split-11">11</a>, <a href="Changelog.md#Split-13">13</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 0)</dt>
<dd>Which axis to split on. A negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
<dt><tt>num_outputs</tt> : int</dt>
<dd>Number of outputs to split parts of the tensor into. If the tensor is not evenly splittable the last chunk will be smaller.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>The tensor to split</dd>
<dt><tt>split</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>Optional length of each output. Values should be >= 0. The sum of the values must equal the dimension size along the specified 'axis'.</dd>
</dl>

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>outputs</tt> (variadic, differentiable) : T</dt>
<dd>One or more outputs forming a list of tensors after splitting</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>1d_opset13</summary>

```python
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)

node = onnx.helper.make_node(
    "Split",
    inputs=["input"],
    outputs=["output_1", "output_2", "output_3"],
    axis=0,
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0]).astype(np.float32),
    np.array([5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_equal_parts_1d_opset13",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)

split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
    "Split",
    inputs=["input", "split"],
    outputs=["output_1", "output_2"],
    axis=0,
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_variable_parts_1d_opset13",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)
```

</details>


<details>
<summary>1d_opset18</summary>

```python
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)

node = onnx.helper.make_node(
    "Split",
    inputs=["input"],
    outputs=["output_1", "output_2", "output_3"],
    axis=0,
    num_outputs=3,
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0]).astype(np.float32),
    np.array([5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_equal_parts_1d_opset18",
)

split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
    "Split",
    inputs=["input", "split"],
    outputs=["output_1", "output_2"],
    axis=0,
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_variable_parts_1d_opset18",
)
```

</details>


<details>
<summary>1d_uneven_split_opset18</summary>

```python
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]).astype(np.float32)

# If axis is not specified, split is applied on default axis 0
node = onnx.helper.make_node(
    "Split",
    inputs=["input"],
    outputs=["output_1", "output_2", "output_3", "output_4"],
    num_outputs=4,
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0]).astype(np.float32),
    np.array([5.0, 6.0]).astype(np.float32),
    np.array([7.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_1d_uneven_split_opset18",
)
```

</details>


<details>
<summary>2d_opset13</summary>

```python
node_input = np.array(
    [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]]
).astype(np.float32)

node = onnx.helper.make_node(
    "Split", inputs=["input"], outputs=["output_1", "output_2"], axis=1
)

expected_outputs = [
    np.array([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]).astype(np.float32),
    np.array([[4.0, 5.0, 6.0], [10.0, 11.0, 12.0]]).astype(np.float32),
]

expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_equal_parts_2d_opset13",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)

split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
    "Split",
    inputs=["input", "split"],
    outputs=["output_1", "output_2"],
    axis=1,
)

expected_outputs = [
    np.array([[1.0, 2.0], [7.0, 8.0]]).astype(np.float32),
    np.array([[3.0, 4.0, 5.0, 6.0], [9.0, 10.0, 11.0, 12.0]]).astype(
        np.float32
    ),
]

expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_variable_parts_2d_opset13",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)
```

</details>


<details>
<summary>2d_opset18</summary>

```python
node_input = np.array(
    [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]]
).astype(np.float32)

node = onnx.helper.make_node(
    "Split",
    inputs=["input"],
    outputs=["output_1", "output_2"],
    axis=1,
    num_outputs=2,
)

expected_outputs = [
    np.array([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]).astype(np.float32),
    np.array([[4.0, 5.0, 6.0], [10.0, 11.0, 12.0]]).astype(np.float32),
]

expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_equal_parts_2d",
)

split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
    "Split",
    inputs=["input", "split"],
    outputs=["output_1", "output_2"],
    axis=1,
)

expected_outputs = [
    np.array([[1.0, 2.0], [7.0, 8.0]]).astype(np.float32),
    np.array([[3.0, 4.0, 5.0, 6.0], [9.0, 10.0, 11.0, 12.0]]).astype(
        np.float32
    ),
]

expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_variable_parts_2d_opset18",
)
```

</details>


<details>
<summary>2d_uneven_split_opset18</summary>

```python
node_input = np.array(
    [
        [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
    ]
).astype(np.float32)

node = onnx.helper.make_node(
    "Split",
    inputs=["input"],
    outputs=["output_1", "output_2", "output_3"],
    axis=1,
    num_outputs=3,
)

expected_outputs = [
    np.array([[1.0, 2.0, 3.0], [9.0, 10.0, 11.0]]).astype(np.float32),
    np.array([[4.0, 5.0, 6.0], [12.0, 13.0, 14.0]]).astype(np.float32),
    np.array([[7.0, 8.0], [15.0, 16.0]]).astype(np.float32),
]

expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_2d_uneven_split_opset18",
)
```

</details>


<details>
<summary>default_values_opset13</summary>

```python
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)

# If axis is not specified, split is applied on default axis 0
node = onnx.helper.make_node(
    "Split", inputs=["input"], outputs=["output_1", "output_2", "output_3"]
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0]).astype(np.float32),
    np.array([5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_equal_parts_default_axis_opset13",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)

split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
    "Split", inputs=["input", "split"], outputs=["output_1", "output_2"]
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_variable_parts_default_axis_opset13",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)
```

</details>


<details>
<summary>default_values_opset18</summary>

```python
node_input = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).astype(np.float32)

# If axis is not specified, split is applied on default axis 0
node = onnx.helper.make_node(
    "Split",
    inputs=["input"],
    outputs=["output_1", "output_2", "output_3"],
    num_outputs=3,
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0]).astype(np.float32),
    np.array([5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input],
    outputs=expected_outputs,
    name="test_split_equal_parts_default_axis_opset18",
)

split = np.array([2, 4]).astype(np.int64)
node = onnx.helper.make_node(
    "Split", inputs=["input", "split"], outputs=["output_1", "output_2"]
)

expected_outputs = [
    np.array([1.0, 2.0]).astype(np.float32),
    np.array([3.0, 4.0, 5.0, 6.0]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_variable_parts_default_axis_opset18",
)
```

</details>


<details>
<summary>zero_size_splits_opset13</summary>

```python
# 1-dimensional tensor with dimension_size=0
node_input = np.array([]).astype(np.float32)

# Split empty tensor into tensors of size zero
split = np.array([0, 0, 0]).astype(np.int64)
node = onnx.helper.make_node(
    "Split",
    inputs=["input", "split"],
    outputs=["output_1", "output_2", "output_3"],
)

expected_outputs = [
    np.array([]).astype(np.float32),
    np.array([]).astype(np.float32),
    np.array([]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_zero_size_splits_opset13",
    opset_imports=[onnx.helper.make_opsetid("", 13)],
)
```

</details>


<details>
<summary>zero_size_splits_opset18</summary>

```python
# 1-dimensional tensor with dimension_size=0
node_input = np.array([]).astype(np.float32)

# Split empty tensor into tensors of size zero
split = np.array([0, 0, 0]).astype(np.int64)
node = onnx.helper.make_node(
    "Split",
    inputs=["input", "split"],
    outputs=["output_1", "output_2", "output_3"],
)

expected_outputs = [
    np.array([]).astype(np.float32),
    np.array([]).astype(np.float32),
    np.array([]).astype(np.float32),
]
expect(
    node,
    inputs=[node_input, split],
    outputs=expected_outputs,
    name="test_split_zero_size_splits_opset18",
)
```

</details>


### <a name="SplitToSequence"></a><a name="splittosequence">**SplitToSequence**</a>

  Split a tensor into a sequence of tensors, along the specified 'axis'.
  Lengths of the parts can be specified using the optional argument 'split'.
  If the argument 'split' is not specified, a default scalar value of 1
  is used as the value of 'split'.
  'split' must contain only positive numbers.
  'split' is either a scalar (tensor of empty shape), or a 1-D tensor.
  If 'split' is a scalar, then 'input' will be split into chunks all of size 'split'
  if possible. The last chunk alone may be smaller than 'split' if the 'input' size
  along the given axis 'axis' is not divisible by 'split'.
  If 'split' is a 1-dimensional tensor, the input tensor is split into 'size(split)' chunks,
  with lengths of the parts on 'axis' specified in 'split'. In this scenario, the sum of entries
  in 'split' must be equal to the dimension size of input tensor on 'axis'.
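
The two forms of 'split' can be sketched in plain Python for the chunk lengths along the split axis. The helper names below are hypothetical, and the sketch does not model the `keepdims` attribute or multi-dimensional inputs.

```python
def sequence_lengths(dim, split=1):
    """Chunk lengths along the split axis (hypothetical helper).

    An int stands in for a scalar 'split'; a list stands in for a 1-D 'split'.
    """
    if isinstance(split, int):
        if split <= 0:
            raise ValueError("scalar 'split' must be positive")
        full, rest = divmod(dim, split)
        # Only the last chunk may be smaller than 'split'.
        return [split] * full + ([rest] if rest else [])
    lengths = list(split)
    if sum(lengths) != dim:
        raise ValueError("entries of 'split' must sum to the axis size")
    return lengths


def split_to_sequence_1d(values, split=1):
    """Split a flat list into a list of chunks."""
    sequence, start = [], 0
    for length in sequence_lengths(len(values), split):
        sequence.append(values[start:start + length])
        start += length
    return sequence


by_scalar = split_to_sequence_1d([0, 1, 2, 3, 4, 5, 6], 3)
by_vector = split_to_sequence_1d([0, 1, 2], [1, 2])
```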

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is 0)</dt>
<dd>Which axis to split on. A negative value means counting dimensions from the back. Accepted range is [-rank, rank-1].</dd>
<dt><tt>keepdims</tt> : int (default is 1)</dt>
<dd>Keep the split dimension or not. Default 1, which means we keep split dimension. If input 'split' is specified, this attribute is ignored.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>input</tt> : T</dt>
<dd>The tensor to split</dd>
<dt><tt>split</tt> (optional) : I</dt>
<dd>Length of each output. It can be either a scalar (tensor of empty shape), or a 1-D tensor. All values must be >= 0.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output_sequence</tt> : S</dt>
<dd>One or more outputs forming a sequence of tensors after splitting</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input types to all tensor types.</dd>
<dt><tt>I</tt> : tensor(int32), tensor(int64)</dt>
<dd>Constrain split size to integral tensor.</dd>
<dt><tt>S</tt> : seq(tensor(uint8)), seq(tensor(uint16)), seq(tensor(uint32)), seq(tensor(uint64)), seq(tensor(int8)), seq(tensor(int16)), seq(tensor(int32)), seq(tensor(int64)), seq(tensor(float16)), seq(tensor(float)), seq(tensor(double)), seq(tensor(string)), seq(tensor(bool)), seq(tensor(complex64)), seq(tensor(complex128))</dt>
<dd>Constrain output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>nokeepdims</summary>

```python
data = np.arange(18).reshape((3, 6)).astype(np.float32)

node = onnx.helper.make_node(
    "SplitToSequence",
    ["data"],
    ["seq"],
    axis=1,
    keepdims=0,
)

expected_outputs = [[data[:, i] for i in range(data.shape[1])]]

expect(
    node,
    inputs=[data],
    outputs=expected_outputs,
    name="test_split_to_sequence_nokeepdims",
)
```

</details>


<details>
<summary>with_split_1</summary>

```python
data = np.arange(18).reshape((3, 6)).astype(np.float32)
split = np.array(2, dtype=np.int64)

node = onnx.helper.make_node(
    "SplitToSequence", ["data", "split"], ["seq"], axis=1
)

expected_outputs = [
    [
        np.array([[0.0, 1.0], [6.0, 7.0], [12.0, 13.0]], dtype=np.float32),
        np.array([[2.0, 3.0], [8.0, 9.0], [14.0, 15.0]], dtype=np.float32),
        np.array([[4.0, 5.0], [10.0, 11.0], [16.0, 17.0]], dtype=np.float32),
    ]
]

expect(
    node,
    inputs=[data, split],
    outputs=expected_outputs,
    name="test_split_to_sequence_1",
)
```

</details>


<details>
<summary>with_split_2</summary>

```python
data = np.arange(18).reshape((3, 6)).astype(np.float32)
split = np.array([1, 2], dtype=np.int64)

node = onnx.helper.make_node(
    "SplitToSequence", ["data", "split"], ["seq"], axis=0
)

expected_outputs = [
    [
        data[:1],
        data[1:],
    ]
]

expect(
    node,
    inputs=[data, split],
    outputs=expected_outputs,
    name="test_split_to_sequence_2",
)
```

</details>


### <a name="Sqrt"></a><a name="sqrt">**Sqrt**</a>

  Square root takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the square root, y = x^0.5, is applied to
  the tensor elementwise. If x is negative, then it will return NaN.
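
The NaN behavior for negative inputs can be sketched in plain Python. Note that `math.sqrt` raises `ValueError` for negative arguments, so the hypothetical helper below returns NaN explicitly; it is an illustration on a flat list, not the operator's implementation.

```python
import math


def sqrt_elementwise(values):
    """Elementwise square root on a flat list, NaN for negative inputs."""
    return [math.sqrt(v) if v >= 0 else math.nan for v in values]


roots = sqrt_elementwise([1.0, 4.0, 9.0])
negative = sqrt_elementwise([-1.0])
```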

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Sqrt-1">1</a>, <a href="Changelog.md#Sqrt-6">6</a>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>sqrt</summary>

```python
node = onnx.helper.make_node(
    "Sqrt",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([1, 4, 9]).astype(np.float32)
y = np.sqrt(x)  # expected output [1., 2., 3.]
expect(node, inputs=[x], outputs=[y], name="test_sqrt_example")

x = np.abs(np.random.randn(3, 4, 5).astype(np.float32))
y = np.sqrt(x)
expect(node, inputs=[x], outputs=[y], name="test_sqrt")
```

</details>


### <a name="Squeeze"></a><a name="squeeze">**Squeeze**</a>

  Remove single-dimensional entries from the shape of a tensor.
  Takes an input `axes` with a list of axes to squeeze.
  If `axes` is not provided, all the single dimensions will be removed from
  the shape. If an axis is selected with shape entry not equal to one, an error is raised.
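
The shape rule above can be sketched as a small pure-Python function. The name `squeezed_shape` is hypothetical; the sketch only computes the output shape and raises `ValueError` where the text says an error is raised.

```python
def squeezed_shape(shape, axes=None):
    """Return the shape after Squeeze (hypothetical helper)."""
    rank = len(shape)
    if axes is None:
        # No axes given: drop every dimension of size 1.
        return [d for d in shape if d != 1]
    selected = set()
    for axis in axes:
        if not -rank <= axis < rank:
            raise ValueError("axis out of range [-r, r-1]")
        selected.add(axis % rank)  # map negative axes to non-negative ones
    for axis in selected:
        if shape[axis] != 1:
            raise ValueError("cannot squeeze a dimension whose size is not 1")
    return [d for i, d in enumerate(shape) if i not in selected]


first = squeezed_shape([1, 3, 4, 5], axes=[0])
negative = squeezed_shape([1, 3, 1, 5], axes=[-2])
all_ones = squeezed_shape([1, 3, 1, 5])
```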

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Squeeze-1">1</a>, <a href="Changelog.md#Squeeze-11">11</a>, <a href="Changelog.md#Squeeze-13">13</a>

#### Inputs (1 - 2)

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Tensors with at least max(dims) dimensions.</dd>
<dt><tt>axes</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>List of integers indicating the dimensions to squeeze. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(data).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>squeezed</tt> (differentiable) : T</dt>
<dd>Reshaped tensor with same data as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input and output types to all tensor types up to IRv10.</dd>
</dl>


#### Examples

<details>
<summary>squeeze</summary>

```python
node = onnx.helper.make_node(
    "Squeeze",
    inputs=["x", "axes"],
    outputs=["y"],
)
x = np.random.randn(1, 3, 4, 5).astype(np.float32)
axes = np.array([0], dtype=np.int64)
y = np.squeeze(x, axis=0)

expect(node, inputs=[x, axes], outputs=[y], name="test_squeeze")
```

</details>


<details>
<summary>squeeze_negative_axes</summary>

```python
node = onnx.helper.make_node(
    "Squeeze",
    inputs=["x", "axes"],
    outputs=["y"],
)
x = np.random.randn(1, 3, 1, 5).astype(np.float32)
axes = np.array([-2], dtype=np.int64)
y = np.squeeze(x, axis=-2)
expect(node, inputs=[x, axes], outputs=[y], name="test_squeeze_negative_axes")
```

</details>


### <a name="StringConcat"></a><a name="stringconcat">**StringConcat**</a>

  StringConcat concatenates string tensors elementwise (with NumPy-style broadcasting support).
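
A simplified sketch of the elementwise behavior, restricted to 1-D inputs represented as Python lists. The helper `string_concat_1d` is a hypothetical name, and full NumPy-style broadcasting over higher ranks is not modeled here.

```python
def string_concat_1d(x, y):
    """Concatenate two lists of strings elementwise, broadcasting length 1."""
    if len(x) != len(y) and 1 not in (len(x), len(y)):
        raise ValueError("lengths are not broadcastable")
    n = len(y) if len(x) == 1 else len(x)
    return [
        (x[0] if len(x) == 1 else x[i]) + (y[0] if len(y) == 1 else y[i])
        for i in range(n)
    ]


same_length = string_concat_1d(["abc", "def"], [".com", ".net"])
broadcast = string_concat_1d(["cat", "dog", "snake"], ["s"])
```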

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>Tensor to prepend in concatenation</dd>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>Tensor to append in concatenation</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Z</tt> (non-differentiable) : T</dt>
<dd>Concatenated string tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(string)</dt>
<dd>Inputs and outputs must be UTF-8 strings</dd>
</dl>


#### Examples

<details>
<summary>stringconcat</summary>

```python
node = onnx.helper.make_node(
    "StringConcat",
    inputs=["x", "y"],
    outputs=["result"],
)
x = np.array(["abc", "def"]).astype("object")
y = np.array([".com", ".net"]).astype("object")
result = np.array(["abc.com", "def.net"]).astype("object")

expect(node, inputs=[x, y], outputs=[result], name="test_string_concat")

x = np.array(["cat", "dog", "snake"]).astype("object")
y = np.array(["s"]).astype("object")
result = np.array(["cats", "dogs", "snakes"]).astype("object")

expect(
    node,
    inputs=[x, y],
    outputs=[result],
    name="test_string_concat_broadcasting",
)

x = np.array("cat").astype("object")
y = np.array("s").astype("object")
result = np.array("cats").astype("object")

expect(
    node,
    inputs=[x, y],
    outputs=[result],
    name="test_string_concat_zero_dimensional",
)

x = np.array(["abc", ""]).astype("object")
y = np.array(["", "abc"]).astype("object")
result = np.array(["abc", "abc"]).astype("object")

expect(
    node,
    inputs=[x, y],
    outputs=[result],
    name="test_string_concat_empty_string",
)

x = np.array(["的", "中"]).astype("object")
y = np.array(["的", "中"]).astype("object")
result = np.array(["的的", "中中"]).astype("object")

expect(
    node,
    inputs=[x, y],
    outputs=[result],
    name="test_string_concat_utf8",
)
```

</details>


### <a name="StringNormalizer"></a><a name="stringnormalizer">**StringNormalizer**</a>

  StringNormalization performs string operations for basic cleaning.
  This operator has only one input (denoted by X) and only one output
  (denoted by Y). This operator first examines the elements in the X,
  and removes elements specified in "stopwords" attribute.
  After removing stop words, the intermediate result can be further lowercased,
  uppercased, or just returned depending on the "case_change_action" attribute.
  This operator only accepts [C]- and [1, C]-tensor.
  If all elements in X are dropped, the output is a string tensor containing a single empty string, with shape [1]
  if the input shape is [C] and shape [1, 1] if the input shape is [1, C].
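
The steps described above can be sketched for a [C]-shaped input held in a Python list. The function name is hypothetical, the `locale` attribute is not modeled (Python's `str.lower` and `str.upper` are used instead), and the [1, C] case is omitted.

```python
def string_normalizer(x, stopwords=(), case_change_action="NONE",
                      is_case_sensitive=False):
    """Drop stop words, then optionally change case (hypothetical helper)."""
    if is_case_sensitive:
        blocked = set(stopwords)
        kept = [s for s in x if s not in blocked]
    else:
        # Case-insensitive matching: compare lowercased forms.
        blocked = {w.lower() for w in stopwords}
        kept = [s for s in x if s.lower() not in blocked]
    if case_change_action == "LOWER":
        kept = [s.lower() for s in kept]
    elif case_change_action == "UPPER":
        kept = [s.upper() for s in kept]
    elif case_change_action != "NONE":
        raise ValueError("case_change_action must be LOWER, UPPER, or NONE")
    # If everything was dropped, return a single empty string (shape [1]).
    return kept if kept else [""]


upper = string_normalizer(
    ["Monday", "tuesday", "wednesday"],
    stopwords=["monday"],
    case_change_action="UPPER",
)
empty = string_normalizer(
    ["monday", "monday"], stopwords=["monday"], is_case_sensitive=True
)
```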

#### Version

This version of the operator has been available since version 10 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>case_change_action</tt> : string (default is NONE)</dt>
<dd>String enum that causes output to be lowercased/uppercased/unchanged. Valid values are "LOWER", "UPPER", "NONE". Default is "NONE".</dd>
<dt><tt>is_case_sensitive</tt> : int (default is 0)</dt>
<dd>Boolean. Whether the identification of stop words in X is case-sensitive. Default is false</dd>
<dt><tt>locale</tt> : string</dt>
<dd>Environment dependent string that denotes the locale according to which output strings need to be upper/lowercased. Default en_US or platform specific equivalent as decided by the implementation.</dd>
<dt><tt>stopwords</tt> : list of strings</dt>
<dd>List of stop words. If not set, no word would be removed from X.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> : tensor(string)</dt>
<dd>UTF-8 strings to normalize</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> : tensor(string)</dt>
<dd>UTF-8 Normalized strings</dd>
</dl>

#### Type Constraints



#### Examples

<details>
<summary>monday_casesensintive_lower</summary>

```python
input = np.array(["monday", "tuesday", "wednesday", "thursday"]).astype(object)
output = np.array(["tuesday", "wednesday", "thursday"]).astype(object)
stopwords = ["monday"]

node = onnx.helper.make_node(
    "StringNormalizer",
    inputs=["x"],
    outputs=["y"],
    case_change_action="LOWER",
    is_case_sensitive=1,
    stopwords=stopwords,
)
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_strnormalizer_export_monday_casesensintive_lower",
)
```

</details>


<details>
<summary>monday_casesensintive_nochangecase</summary>

```python
input = np.array(["monday", "tuesday", "wednesday", "thursday"]).astype(object)
output = np.array(["tuesday", "wednesday", "thursday"]).astype(object)
stopwords = ["monday"]

node = onnx.helper.make_node(
    "StringNormalizer",
    inputs=["x"],
    outputs=["y"],
    is_case_sensitive=1,
    stopwords=stopwords,
)
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_strnormalizer_export_monday_casesensintive_nochangecase",
)
```

</details>


<details>
<summary>monday_casesensintive_upper</summary>

```python
input = np.array(["monday", "tuesday", "wednesday", "thursday"]).astype(object)
output = np.array(["TUESDAY", "WEDNESDAY", "THURSDAY"]).astype(object)
stopwords = ["monday"]

node = onnx.helper.make_node(
    "StringNormalizer",
    inputs=["x"],
    outputs=["y"],
    case_change_action="UPPER",
    is_case_sensitive=1,
    stopwords=stopwords,
)
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_strnormalizer_export_monday_casesensintive_upper",
)
```

</details>


<details>
<summary>monday_empty_output</summary>

```python
input = np.array(["monday", "monday"]).astype(object)
output = np.array([""]).astype(object)
stopwords = ["monday"]

node = onnx.helper.make_node(
    "StringNormalizer",
    inputs=["x"],
    outputs=["y"],
    case_change_action="UPPER",
    is_case_sensitive=1,
    stopwords=stopwords,
)
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_strnormalizer_export_monday_empty_output",
)
```

</details>


<details>
<summary>monday_insensintive_upper_twodim</summary>

```python
input = (
    np.array(
        ["Monday", "tuesday", "wednesday", "Monday", "tuesday", "wednesday"]
    )
    .astype(object)
    .reshape([1, 6])
)

# It does upper case cedilla, accented E
# and German umlaut but fails
# with German eszett
output = (
    np.array(["TUESDAY", "WEDNESDAY", "TUESDAY", "WEDNESDAY"])
    .astype(object)
    .reshape([1, 4])
)
stopwords = ["monday"]

node = onnx.helper.make_node(
    "StringNormalizer",
    inputs=["x"],
    outputs=["y"],
    case_change_action="UPPER",
    stopwords=stopwords,
)
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_strnormalizer_export_monday_insensintive_upper_twodim",
)
```

</details>


<details>
<summary>nostopwords_nochangecase</summary>

```python
input = np.array(["monday", "tuesday"]).astype(object)
output = input

# No stopwords. This is a NOOP
node = onnx.helper.make_node(
    "StringNormalizer",
    inputs=["x"],
    outputs=["y"],
    is_case_sensitive=1,
)
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_strnormalizer_nostopwords_nochangecase",
)
```

</details>


### <a name="StringSplit"></a><a name="stringsplit">**StringSplit**</a>

  StringSplit splits a string tensor's elements into substrings based on a delimiter attribute and a maxsplit attribute.

  The first output of this operator is a tensor of strings representing the substrings from splitting each input string on the `delimiter` substring. This tensor has one additional rank compared to the input tensor in order to store the substrings for each input element (where the input tensor is not empty). Note that, in order to ensure the same number of elements are present in the final dimension, this tensor will pad empty strings as illustrated in the examples below. Consecutive delimiters are not grouped together and are deemed to delimit empty strings, except if the `delimiter` is unspecified or is the empty string (""). In the case where the `delimiter` is unspecified or the empty string, consecutive whitespace characters are regarded as a single separator and leading or trailing whitespace is removed in the output.

  The second output tensor represents the number of substrings generated. `maxsplit` can be used to limit the number of splits performed - after the `maxsplit`th split if the string is not fully split, the trailing suffix of input string after the final split point is also added. For elements where fewer splits are possible than specified in `maxsplit`, it has no effect.
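
The splitting and padding rules can be sketched for a 1-D input with Python's built-in `str.split`, whose handling of consecutive delimiters, whitespace, and `maxsplit` appears to line up with the description above. The helper name `string_split_1d` is hypothetical, and higher-rank inputs are not modeled.

```python
def string_split_1d(x, delimiter=None, maxsplit=None):
    """Return (padded substrings, substring counts) for a list of strings."""
    # An unset or empty delimiter means "split on runs of whitespace".
    sep = delimiter if delimiter else None
    limit = -1 if maxsplit is None else maxsplit
    parts = [s.split(sep, limit) for s in x]
    counts = [len(p) for p in parts]
    width = max(counts, default=0)
    # Pad with empty strings so every row has the same length.
    padded = [p + [""] * (width - len(p)) for p in parts]
    return padded, counts


substrings, length = string_split_1d(
    ["hello world", "def.net", "o n n x"], maxsplit=2
)
```

With `maxsplit=2` the third string keeps its unsplit suffix `"n x"`, and the shorter rows are padded with empty strings to the common width of 3.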

#### Version

This version of the operator has been available since version 20 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>delimiter</tt> : string</dt>
<dd>Delimiter to split on. If left unset or set to the empty string (""), the input is split on consecutive whitespace.</dd>
<dt><tt>maxsplit</tt> : int</dt>
<dd>Maximum number of splits (from left to right). If left unset (or if the number of possible splits is less than maxsplit), it will make as many splits as possible. Note that the maximum possible number of substrings returned with `maxsplit` specified is `maxsplit+1` since the remaining suffix after the `maxsplit`th split is included in the output.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T1</dt>
<dd>Tensor of strings to split.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T2</dt>
<dd>Tensor of substrings representing the outcome of splitting the strings in the input on the delimiter. Note that to ensure the same number of elements are present in the final rank, this tensor will pad any necessary empty strings.</dd>
<dt><tt>Z</tt> (non-differentiable) : T3</dt>
<dd>The number of substrings generated for each input element.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(string)</dt>
<dd>The input must be a UTF-8 string tensor</dd>
<dt><tt>T2</tt> : tensor(string)</dt>
<dd>Tensor of substrings.</dd>
<dt><tt>T3</tt> : tensor(int64)</dt>
<dd>The number of substrings generated.</dd>
</dl>


#### Examples

<details>
<summary>basic</summary>

```python
node = onnx.helper.make_node(
    "StringSplit",
    inputs=["x"],
    outputs=["substrings", "length"],
    delimiter=".",
    maxsplit=None,
)

x = np.array(["abc.com", "def.net"]).astype(object)

substrings = np.array([["abc", "com"], ["def", "net"]]).astype(object)

length = np.array([2, 2], dtype=np.int64)

expect(
    node,
    inputs=[x],
    outputs=[substrings, length],
    name="test_string_split_basic",
)
```

</details>


<details>
<summary>consecutive_delimiters</summary>

```python
node = onnx.helper.make_node(
    "StringSplit",
    inputs=["x"],
    outputs=["substrings", "length"],
    delimiter="-",
    maxsplit=None,
)

x = np.array(["o-n-n--x-", "o-n----nx"]).astype(object)

substrings = np.array(
    [["o", "n", "n", "", "x", ""], ["o", "n", "", "", "", "nx"]]
).astype(object)

length = np.array([6, 6], dtype=np.int64)

expect(
    node,
    inputs=[x],
    outputs=[substrings, length],
    name="test_string_split_consecutive_delimiters",
)
```

</details>


<details>
<summary>empty_string_delimiter</summary>

```python
for delimiter, test_name in (
    ("", "test_string_split_empty_string_delimiter"),
    (None, "test_string_split_no_delimiter"),
):
    node = onnx.helper.make_node(
        "StringSplit",
        inputs=["x"],
        outputs=["substrings", "length"],
        delimiter=delimiter,
        maxsplit=None,
    )

    x = np.array(
        ["hello world !", "  hello   world !", " hello world   ! "]
    ).astype(object)

    substrings = np.array(
        [
            ["hello", "world", "!"],
            ["hello", "world", "!"],
            ["hello", "world", "!"],
        ]
    ).astype(object)

    length = np.array([3, 3, 3], dtype=np.int64)

    expect(
        node,
        inputs=[x],
        outputs=[substrings, length],
        name=test_name,
    )
```

</details>


<details>
<summary>empty_string_split</summary>

```python
node = onnx.helper.make_node(
    "StringSplit",
    inputs=["x"],
    outputs=["substrings", "length"],
    delimiter=None,
    maxsplit=None,
)

x = np.array([]).astype(object)

substrings = np.array([]).astype(object).reshape(0, 0)

length = np.array([], dtype=np.int64)

expect(
    node,
    inputs=[x],
    outputs=[substrings, length],
    name="test_string_split_empty_tensor",
    output_type_protos=[
        onnx.helper.make_tensor_type_proto(onnx.TensorProto.STRING, (0, None)),
        None,
    ],
)
```

</details>


<details>
<summary>maxsplit</summary>

```python
node = onnx.helper.make_node(
    "StringSplit",
    inputs=["x"],
    outputs=["substrings", "length"],
    maxsplit=2,
)

x = np.array(
    [["hello world", "def.net"], ["o n n x", "the quick brown fox"]]
).astype(object)

substrings = np.array(
    [
        [["hello", "world", ""], ["def.net", "", ""]],
        [["o", "n", "n x"], ["the", "quick", "brown fox"]],
    ]
).astype(object)

length = np.array([[2, 1], [3, 3]], np.int64)

expect(
    node,
    inputs=[x],
    outputs=[substrings, length],
    name="test_string_split_maxsplit",
)
```

</details>


### <a name="Sub"></a><a name="sub">**Sub**</a>

  Performs element-wise binary subtraction (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

  (Opset 14 change): Extend supported types to include uint8, int8, uint16, and int16.
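
To illustrate multidirectional broadcasting with one of the integer types added in opset 14, here is a small NumPy sketch. NumPy stands in for the operator here; this is not the ONNX implementation, and the variable names are illustrative.

```python
import numpy as np

a = np.array([[10], [20], [30]], dtype=np.int16)  # shape (3, 1)
b = np.array([[1, 2, 3, 4]], dtype=np.int16)      # shape (1, 4)

# Both operands are expanded to the common shape (3, 4) before subtracting.
c = a - b
```

Neither input has the output shape on its own, which is what distinguishes multidirectional broadcasting from the unidirectional kind.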

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Sub-1">1</a>, <a href="Changelog.md#Sub-6">6</a>, <a href="Changelog.md#Sub-7">7</a>, <a href="Changelog.md#Sub-13">13</a>

#### Inputs

<dl>
<dt><tt>A</tt> (differentiable) : T</dt>
<dd>First operand.</dd>
<dt><tt>B</tt> (differentiable) : T</dt>
<dd>Second operand.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (differentiable) : T</dt>
<dd>Result, with the same element type as the two inputs.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to all numeric tensors.</dd>
</dl>


#### Examples

<details>
<summary>sub</summary>

```python
node = onnx.helper.make_node(
    "Sub",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.array([1, 2, 3]).astype(np.float32)
y = np.array([3, 2, 1]).astype(np.float32)
z = x - y  # expected output [-2., 0., 2.]
expect(node, inputs=[x, y], outputs=[z], name="test_sub_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(3, 4, 5).astype(np.float32)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub")

x = np.random.randint(12, 24, size=(3, 4, 5), dtype=np.uint8)
y = np.random.randint(12, size=(3, 4, 5), dtype=np.uint8)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_uint8")
```

</details>


<details>
<summary>sub_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Sub",
    inputs=["x", "y"],
    outputs=["z"],
)

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.random.randn(5).astype(np.float32)
z = x - y
expect(node, inputs=[x, y], outputs=[z], name="test_sub_bcast")
```

</details>


### <a name="Sum"></a><a name="sum">**Sum**</a>

  Element-wise sum of each of the input tensors (with Numpy-style broadcasting support).
  All inputs and outputs must have the same data type.
  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
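
As a rough NumPy sketch of the semantics described above (not the ONNX reference implementation), a variadic sum with broadcasting can be written by folding `np.add` over the inputs. The helper name `variadic_sum` is hypothetical.

```python
from functools import reduce

import numpy as np


def variadic_sum(*tensors):
    """Fold np.add over one or more same-dtype tensors, with NumPy broadcasting."""
    return reduce(np.add, tensors)


a = np.ones((3, 4, 5), dtype=np.float32)
b = np.arange(5, dtype=np.float32)          # shape (5,), broadcast over the last axis
c = np.full((4, 1), 2.0, dtype=np.float32)  # shape (4, 1), broadcast over axes 0 and 2

total = variadic_sum(a, b, c)  # shape (3, 4, 5)
single = variadic_sum(a)       # one input: the result is that input
```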

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Sum-1">1</a>, <a href="Changelog.md#Sum-6">6</a>, <a href="Changelog.md#Sum-8">8</a>

#### Inputs (1 - &#8734;)

<dl>
<dt><tt>data_0</tt> (variadic, differentiable) : T</dt>
<dd>List of tensors for sum.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>sum</tt> (differentiable) : T</dt>
<dd>Output tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>sum</summary>

```python
data_0 = np.array([3, 0, 2]).astype(np.float32)
data_1 = np.array([1, 3, 4]).astype(np.float32)
data_2 = np.array([2, 6, 6]).astype(np.float32)
result = np.array([6, 9, 12]).astype(np.float32)
node = onnx.helper.make_node(
    "Sum",
    inputs=["data_0", "data_1", "data_2"],
    outputs=["result"],
)
expect(
    node,
    inputs=[data_0, data_1, data_2],
    outputs=[result],
    name="test_sum_example",
)

node = onnx.helper.make_node(
    "Sum",
    inputs=["data_0"],
    outputs=["result"],
)
expect(node, inputs=[data_0], outputs=[data_0], name="test_sum_one_input")

result = np.add(data_0, data_1)
node = onnx.helper.make_node(
    "Sum",
    inputs=["data_0", "data_1"],
    outputs=["result"],
)
expect(
    node, inputs=[data_0, data_1], outputs=[result], name="test_sum_two_inputs"
)
```

</details>


### <a name="Tan"></a><a name="tan">**Tan**</a>

  Calculates the tangent of the given input tensor, element-wise.

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The tangent of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>tan</summary>

```python
node = onnx.helper.make_node(
    "Tan",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.tan(x)
expect(node, inputs=[x], outputs=[y], name="test_tan_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.tan(x)
expect(node, inputs=[x], outputs=[y], name="test_tan")
```

</details>


### <a name="Tanh"></a><a name="tanh">**Tanh**</a>

  Calculates the hyperbolic tangent of the given input tensor element-wise.

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Tanh-1">1</a>, <a href="Changelog.md#Tanh-6">6</a>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>The hyperbolic tangent values of the input tensor computed element-wise</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double), tensor(bfloat16)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>tanh</summary>

```python
node = onnx.helper.make_node(
    "Tanh",
    inputs=["x"],
    outputs=["y"],
)

x = np.array([-1, 0, 1]).astype(np.float32)
y = np.tanh(x)  # expected output [-0.76159418, 0., 0.76159418]
expect(node, inputs=[x], outputs=[y], name="test_tanh_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.tanh(x)
expect(node, inputs=[x], outputs=[y], name="test_tanh")
```

</details>


### <a name="TfIdfVectorizer"></a><a name="tfidfvectorizer">**TfIdfVectorizer**</a>

  This transform extracts n-grams from the input sequence and saves them as a vector. Input can
  be either a 1-D or 2-D tensor. For 1-D input, output is the n-gram representation of that input.
  For 2-D input, the output is also a  2-D tensor whose i-th row is the n-gram representation of the i-th input row.
  More specifically, if input shape is [C], the corresponding output shape would be [max(ngram_indexes) + 1].
  If input shape is [N, C], this operator produces a [N, max(ngram_indexes) + 1]-tensor.

  In contrast to standard n-gram extraction, here, the indexes used to extract an n-gram from the original
  sequence are not necessarily consecutive numbers. The discontinuity between indexes is controlled by the number of skips.
  If the number of skips is 2, we should skip two tokens when scanning through the original sequence.
  Let's consider an example. Assume that input sequence is [94, 17, 36, 12, 28] and the number of skips is 2.
  The associated 2-grams are [94, 12] and [17, 28] respectively indexed by [0, 3] and [1, 4].
  If the number of skips becomes 0, the 2-grams generated are [94, 17], [17, 36], [36, 12], [12, 28]
  indexed by [0, 1], [1, 2], [2, 3], [3, 4], respectively.
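
The skip example above can be reproduced with a short pure-Python sketch. The helper `skip_bigrams` is hypothetical and handles only 2-grams with an exact skip count; it is meant to illustrate the indexing, not the operator's implementation.

```python
def skip_bigrams(sequence, skip):
    """Return (bigram, indexes) pairs that skip exactly `skip` tokens between items."""
    step = skip + 1
    return [
        ((sequence[i], sequence[i + step]), (i, i + step))
        for i in range(len(sequence) - step)
    ]


tokens = [94, 17, 36, 12, 28]
with_two_skips = skip_bigrams(tokens, skip=2)
with_no_skips = skip_bigrams(tokens, skip=0)
```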

  The output vector (denoted by Y) stores the count of each n-gram;
  Y[ngram_indexes[i]] indicates the number of times the i-th n-gram is found. The attribute ngram_indexes is used to determine the mapping
  between index i and the corresponding n-gram's output coordinate. If pool_int64s is [94, 17, 17, 36], ngram_indexes is [1, 0],
  ngram_counts=[0, 0], then Y[0] (first element in Y) and Y[1] (second element in Y) are the counts of [17, 36] and [94, 17],
  respectively. An n-gram which cannot be found in pool_strings/pool_int64s should be ignored and has no effect on the output.
  Note that we may consider all skips up to S when generating the n-grams.
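
A minimal sketch of the mapping from pool n-grams to output coordinates, using the pool from the paragraph above. It assumes mode "TF", no skips, and a pool that contains only bigrams; the helper `count_bigrams` and the input sequence are made up for illustration.

```python
def count_bigrams(sequence, pool_bigrams, ngram_indexes):
    """Count consecutive bigrams of `sequence` that appear in the pool (TF mode, no skips)."""
    y = [0.0] * (max(ngram_indexes) + 1)
    # The i-th pool n-gram is written to output coordinate ngram_indexes[i].
    coordinate = {gram: ngram_indexes[i] for i, gram in enumerate(pool_bigrams)}
    for gram in zip(sequence, sequence[1:]):
        if gram in coordinate:  # n-grams missing from the pool are ignored
            y[coordinate[gram]] += 1.0
    return y


pool_int64s = [94, 17, 17, 36]  # two bigrams stored back to back
pool_bigrams = list(zip(pool_int64s[0::2], pool_int64s[1::2]))
y = count_bigrams([94, 17, 36, 94, 17], pool_bigrams, ngram_indexes=[1, 0])
```

Here `y[0]` counts [17, 36] and `y[1]` counts [94, 17], matching the coordinates given by `ngram_indexes`.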

  The examples above assume that mode is "TF". If mode is "IDF", all counts larger than 1 are truncated to 1 and
  the i-th element in weights is used to scale (by multiplication) the count of the i-th n-gram in pool. If mode is "TFIDF",
  this operator first computes the counts of all n-grams and then scales them by the associated values in the weights attribute.
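
The three modes can be sketched as follows. This is a simplification that assumes `counts` and `weights` are already aligned index by index (the operator itself routes them through ngram_indexes); the function name `apply_mode` is hypothetical.

```python
def apply_mode(counts, weights, mode):
    """Turn raw n-gram counts into TF, IDF, or TFIDF values."""
    if mode == "TF":
        return [float(c) for c in counts]
    if mode == "IDF":
        # Counts above 1 are truncated to 1 before weighting.
        return [w * min(c, 1) for c, w in zip(counts, weights)]
    if mode == "TFIDF":
        return [w * c for c, w in zip(counts, weights)]
    raise ValueError(f"unknown mode: {mode!r}")


counts = [0, 3, 1]
weights = [2.0, 0.5, 4.0]

tf = apply_mode(counts, weights, "TF")
idf = apply_mode(counts, weights, "IDF")
tfidf = apply_mode(counts, weights, "TFIDF")
```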

  Only one of pool_strings and pool_int64s can be set. If pool_int64s is set, the input should be an integer tensor.
  If pool_strings is set, the input must be a string tensor.

#### Version

This version of the operator has been available since version 9 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>max_gram_length</tt> : int (required)</dt>
<dd>Maximum n-gram length. If this value is 3, 3-grams will be used to generate the output.</dd>
<dt><tt>max_skip_count</tt> : int (required)</dt>
<dd>Maximum number of items (integers/strings) to be skipped when constructing an n-gram from X. If max_skip_count=1, min_gram_length=2, max_gram_length=3, this operator may generate 2-grams with skip_count=0 and skip_count=1, and 3-grams with skip_count=0 and skip_count=1</dd>
<dt><tt>min_gram_length</tt> : int (required)</dt>
<dd>Minimum n-gram length. If this value is 2 and max_gram_length is 3, output may contain counts of 2-grams and 3-grams.</dd>
<dt><tt>mode</tt> : string (required)</dt>
<dd>The weighting criteria. It can be one of "TF" (term frequency), "IDF" (inverse document frequency), and "TFIDF" (the combination of TF and IDF)</dd>
<dt><tt>ngram_counts</tt> : list of ints (required)</dt>
<dd>The starting indexes of 1-grams, 2-grams, and so on in pool. It is useful when determining the boundary between two consecutive collections of n-grams. For example, if ngram_counts is [0, 17, 36], the first indexes (zero-based) of 1-grams/2-grams/3-grams in pool are 0/17/36. This format is essentially identical to CSR (or CSC) sparse matrix format, and we choose to use this due to its popularity.</dd>
<dt><tt>ngram_indexes</tt> : list of ints (required)</dt>
<dd>List of int64s (type: AttributeProto::INTS). This list is parallel to the specified 'pool_*' attribute. The i-th element in ngram_indexes indicates the coordinate of the i-th n-gram in the output tensor.</dd>
<dt><tt>pool_int64s</tt> : list of ints</dt>
<dd>List of int64 n-grams learned from the training set. Either this or the pool_strings attribute must be present, but not both. It's a 1-D tensor starting with the collections of all 1-grams and ending with the collections of n-grams. The i-th element in pool stores the n-gram that should be mapped to coordinate ngram_indexes[i] in the output vector.</dd>
<dt><tt>pool_strings</tt> : list of strings</dt>
<dd>List of string n-grams learned from the training set. Either this or the pool_int64s attribute must be present, but not both. It's a 1-D tensor starting with the collections of all 1-grams and ending with the collections of n-grams. The i-th element in pool stores the n-gram that should be mapped to coordinate ngram_indexes[i] in the output vector.</dd>
<dt><tt>weights</tt> : list of floats</dt>
<dd>List of floats. This attribute stores the weight of each n-gram in pool. The i-th element in weights is the weight of the i-th n-gram in pool. Its length equals the size of ngram_indexes. By default, weights is an all-one tensor. This attribute is used when mode is "IDF" or "TFIDF" to scale the associated word counts.</dd>
</dl>
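
To make the ngram_counts layout concrete, the following sketch splits a flat pool into its 1-gram and 2-gram collections. The helper `split_pool` is hypothetical; the pool and ngram_counts values are the ones used in the examples further down.

```python
def split_pool(pool, ngram_counts):
    """Group a flat pool into lists of n-gram tuples, keyed by n."""
    # ngram_counts[n - 1] is the start of the n-gram collection; the pool end closes the last one.
    bounds = list(ngram_counts) + [len(pool)]
    grams = {}
    for n, (start, end) in enumerate(zip(bounds, bounds[1:]), start=1):
        items = pool[start:end]
        grams[n] = [tuple(items[i:i + n]) for i in range(0, len(items), n)]
    return grams


grams = split_pool([2, 3, 5, 4, 5, 6, 7, 8, 6, 7], ngram_counts=[0, 4])
bigrams_only = split_pool([5, 6, 7, 8, 6, 7], ngram_counts=[0, 0])
```

With ngram_counts=[0, 0] the 1-gram collection is empty, so the whole pool is read as bigrams.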

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>Input for n-gram extraction</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (non-differentiable) : T1</dt>
<dd>Ngram results</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(string), tensor(int32), tensor(int64)</dt>
<dd>Input is either string UTF-8 or int32/int64</dd>
<dt><tt>T1</tt> : tensor(float)</dt>
<dd>1-D tensor of floats</dd>
</dl>


#### Examples

<details>
<summary>tf_batch_onlybigrams_skip0</summary>

```python
input = np.array([[1, 1, 3, 3, 3, 7], [8, 6, 7, 5, 6, 8]]).astype(np.int32)
output = np.array(
    [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0]]
).astype(np.float32)

ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
    [
        2, 3, 5, 4,  # unigrams
        5, 6, 7, 8, 6, 7,  # bigrams
    ]
).astype(np.int64)

helper = TfIdfVectorizerHelper(
    mode="TF",
    min_gram_length=2,
    max_gram_length=2,
    max_skip_count=0,
    ngram_counts=ngram_counts,
    ngram_indexes=ngram_indexes,
    pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_tfidfvectorizer_tf_batch_onlybigrams_skip0",
)
```

</details>


<details>
<summary>tf_batch_onlybigrams_skip5</summary>

```python
input = np.array([[1, 1, 3, 3, 3, 7], [8, 6, 7, 5, 6, 8]]).astype(np.int32)
output = np.array(
    [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]]
).astype(np.float32)

ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
    [
        2, 3, 5, 4,  # unigrams
        5, 6, 7, 8, 6, 7,  # bigrams
    ]
).astype(np.int64)

helper = TfIdfVectorizerHelper(
    mode="TF",
    min_gram_length=2,
    max_gram_length=2,
    max_skip_count=5,
    ngram_counts=ngram_counts,
    ngram_indexes=ngram_indexes,
    pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_tfidfvectorizer_tf_batch_onlybigrams_skip5",
)
```

</details>


<details>
<summary>tf_batch_uniandbigrams_skip5</summary>

```python
input = np.array([[1, 1, 3, 3, 3, 7], [8, 6, 7, 5, 6, 8]]).astype(np.int32)
output = np.array(
    [[0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0]]
).astype(np.float32)

ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
    [
        2, 3, 5, 4,  # unigrams
        5, 6, 7, 8, 6, 7,  # bigrams
    ]
).astype(np.int64)

helper = TfIdfVectorizerHelper(
    mode="TF",
    min_gram_length=1,
    max_gram_length=2,
    max_skip_count=5,
    ngram_counts=ngram_counts,
    ngram_indexes=ngram_indexes,
    pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_tfidfvectorizer_tf_batch_uniandbigrams_skip5",
)
```

</details>


<details>
<summary>tf_only_bigrams_skip0</summary>

```python
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]).astype(np.float32)

ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
    [
        2, 3, 5, 4,  # unigrams
        5, 6, 7, 8, 6, 7,  # bigrams
    ]
).astype(np.int64)

helper = TfIdfVectorizerHelper(
    mode="TF",
    min_gram_length=2,
    max_gram_length=2,
    max_skip_count=0,
    ngram_counts=ngram_counts,
    ngram_indexes=ngram_indexes,
    pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_tfidfvectorizer_tf_only_bigrams_skip0",
)
```

</details>


<details>
<summary>tf_onlybigrams_levelempty</summary>

```python
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([1.0, 1.0, 1.0]).astype(np.float32)

ngram_counts = np.array([0, 0]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2]).astype(np.int64)
pool_int64s = np.array(
    [5, 6, 7, 8, 6, 7]  # no unigrams; bigrams only
).astype(np.int64)

helper = TfIdfVectorizerHelper(
    mode="TF",
    min_gram_length=2,
    max_gram_length=2,
    max_skip_count=0,
    ngram_counts=ngram_counts,
    ngram_indexes=ngram_indexes,
    pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_tfidfvectorizer_tf_onlybigrams_levelempty",
)
```

</details>


<details>
<summary>tf_onlybigrams_skip5</summary>

```python
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0]).astype(np.float32)

ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
    [
        2, 3, 5, 4,  # unigrams
        5, 6, 7, 8, 6, 7,  # bigrams
    ]
).astype(np.int64)

helper = TfIdfVectorizerHelper(
    mode="TF",
    min_gram_length=2,
    max_gram_length=2,
    max_skip_count=5,
    ngram_counts=ngram_counts,
    ngram_indexes=ngram_indexes,
    pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_tfidfvectorizer_tf_onlybigrams_skip5",
)
```

</details>


<details>
<summary>tf_uniandbigrams_skip5</summary>

```python
input = np.array([1, 1, 3, 3, 3, 7, 8, 6, 7, 5, 6, 8]).astype(np.int32)
output = np.array([0.0, 3.0, 1.0, 0.0, 1.0, 3.0, 1.0]).astype(np.float32)

ngram_counts = np.array([0, 4]).astype(np.int64)
ngram_indexes = np.array([0, 1, 2, 3, 4, 5, 6]).astype(np.int64)
pool_int64s = np.array(
    [
        2, 3, 5, 4,  # unigrams
        5, 6, 7, 8, 6, 7,  # bigrams
    ]
).astype(np.int64)

helper = TfIdfVectorizerHelper(
    mode="TF",
    min_gram_length=1,
    max_gram_length=2,
    max_skip_count=5,
    ngram_counts=ngram_counts,
    ngram_indexes=ngram_indexes,
    pool_int64s=pool_int64s,
)
node = helper.make_node_noweights()
expect(
    node,
    inputs=[input],
    outputs=[output],
    name="test_tfidfvectorizer_tf_uniandbigrams_skip5",
)
```

</details>


### <a name="ThresholdedRelu"></a><a name="thresholdedrelu">**ThresholdedRelu**</a>

  ThresholdedRelu takes one input data (Tensor<T>) and produces one output data
  (Tensor<T>) where the rectified linear function, y = x for x > alpha, y = 0 otherwise,
  is applied to the tensor elementwise.
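
The formula above can be written directly with `np.where`. This is a minimal sketch, not the reference implementation; the function name `thresholded_relu` is illustrative.

```python
import numpy as np


def thresholded_relu(x, alpha=1.0):
    """Keep values strictly greater than alpha; zero out everything else."""
    return np.where(x > alpha, x, 0).astype(x.dtype)


x = np.array([-1.5, 0.0, 1.2, 2.0, 2.2], dtype=np.float32)
y = thresholded_relu(x, alpha=2.0)
```

Note that the comparison is strict, so an element equal to alpha (2.0 here) maps to 0.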

#### Version

This version of the operator has been available since version 10 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>alpha</tt> : float (default is 1.0)</dt>
<dd>Threshold value</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Input tensor</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Output tensor</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>default</summary>

```python
default_alpha = 1.0
node = onnx.helper.make_node("ThresholdedRelu", inputs=["x"], outputs=["y"])
x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, default_alpha, np.inf)
y[y == default_alpha] = 0

expect(node, inputs=[x], outputs=[y], name="test_thresholdedrelu_default")
```

</details>


<details>
<summary>thresholdedrelu</summary>

```python
alpha = 2.0
node = onnx.helper.make_node(
    "ThresholdedRelu", inputs=["x"], outputs=["y"], alpha=alpha
)

x = np.array([-1.5, 0.0, 1.2, 2.0, 2.2]).astype(np.float32)
y = np.clip(x, alpha, np.inf)  # expected output [0., 0., 0., 0., 2.2]
y[y == alpha] = 0

expect(node, inputs=[x], outputs=[y], name="test_thresholdedrelu_example")

x = np.random.randn(3, 4, 5).astype(np.float32)
y = np.clip(x, alpha, np.inf)
y[y == alpha] = 0

expect(node, inputs=[x], outputs=[y], name="test_thresholdedrelu")
```

</details>


### <a name="Tile"></a><a name="tile">**Tile**</a>

  Constructs a tensor by tiling a given tensor.
  This is the same as the function `tile` in Numpy, but without broadcasting.
  For example A = [[1, 2], [3, 4]], B = [1, 2], tile(A, B) = [[1, 2, 1, 2], [3, 4, 3, 4]]
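
The example above, checked with NumPy's `tile`. In this sketch `repeats` is given one entry per dimension of the input, which is the case the operator covers.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
repeats = np.array([1, 2], dtype=np.int64)  # one entry per dimension of `a`

tiled = np.tile(a, repeats)

# Each output dimension is the input dimension times its repeat count.
expected_shape = tuple(int(d * r) for d, r in zip(a.shape, repeats))
```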

#### Version

This version of the operator has been available since version 13 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Tile-1">1</a>, <a href="Changelog.md#Tile-6">6</a>

#### Inputs

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor of any shape.</dd>
<dt><tt>repeats</tt> (non-differentiable) : T1</dt>
<dd>1-D int64 tensor whose length equals the number of dimensions of input. Each entry gives the number of repeated copies along the corresponding dimension of input.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Output tensor of the same dimensions and type as tensor input. output_dim[i] = input_dim[i] * repeats[i]</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
<dt><tt>T1</tt> : tensor(int64)</dt>
<dd>Constrain repeat's type to int64 tensors.</dd>
</dl>


#### Examples

<details>
<summary>tile</summary>

```python
node = onnx.helper.make_node("Tile", inputs=["x", "y"], outputs=["z"])

x = np.random.rand(2, 3, 4, 5).astype(np.float32)

repeats = np.random.randint(low=1, high=10, size=(np.ndim(x),)).astype(np.int64)

z = np.tile(x, repeats)

expect(node, inputs=[x, repeats], outputs=[z], name="test_tile")
```

</details>


<details>
<summary>tile_precomputed</summary>

```python
node = onnx.helper.make_node("Tile", inputs=["x", "y"], outputs=["z"])

x = np.array([[0, 1], [2, 3]], dtype=np.float32)

repeats = np.array([2, 2], dtype=np.int64)

z = np.array(
    [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 0, 1], [2, 3, 2, 3]], dtype=np.float32
)

expect(node, inputs=[x, repeats], outputs=[z], name="test_tile_precomputed")
```

</details>


### <a name="TopK"></a><a name="topk">**TopK**</a>

  Retrieve the top-K largest or smallest elements along a specified axis. Given an input tensor of
  shape [a_0, a_1, ..., a_{n-1}] and integer argument k, return two outputs:

  * Value tensor of shape [a_0, a_1, ..., a_{axis-1}, k, a_{axis+1}, ... a_{n-1}]
    which contains the values of the top k elements along the specified axis
  * Index tensor of shape [a_0, a_1, ..., a_{axis-1}, k, a_{axis+1}, ... a_{n-1}] which
    contains the indices of the top k elements (original indices from the input
    tensor).

  * If "largest" is 1 (the default value) then the k largest elements are returned.
  * If "sorted" is 1 (the default value) then the resulting k elements will be sorted.
  * If "sorted" is 0, the order of the returned 'Values' and 'Indices' is undefined.

  Given two equivalent values, this operator uses the indices along the axis as
  a tiebreaker. That is, the element with the lower index will appear first.
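
One way to obtain this tiebreaking behavior in NumPy is a stable sort, sketched below. The helper `topk_sketch` is hypothetical and assumes a signed or floating-point input, since it negates the values to sort in descending order; it is not the reference implementation used by the tests.

```python
import numpy as np


def topk_sketch(x, k, axis=-1, largest=True):
    """Return (values, indices) of the k largest or smallest entries along `axis`."""
    keys = -x if largest else x
    # A stable sort keeps equal values in ascending index order (the tiebreaker).
    order = np.argsort(keys, axis=axis, kind="stable")
    indices = np.take(order, np.arange(k), axis=axis)
    values = np.take_along_axis(x, indices, axis=axis)
    return values, indices.astype(np.int64)


x = np.array([[1.0, 5.0, 5.0, 2.0], [7.0, 7.0, 0.0, 3.0]], dtype=np.float32)
values, indices = topk_sketch(x, k=2, axis=1)
small_values, small_indices = topk_sketch(x, k=2, axis=1, largest=False)
```

In the first row the two 5.0 entries tie, and index 1 is returned before index 2.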

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#TopK-1">1</a>, <a href="Changelog.md#TopK-10">10</a>

#### Attributes

<dl>
<dt><tt>axis</tt> : int (default is -1)</dt>
<dd>Dimension on which to do the sort. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
<dt><tt>largest</tt> : int (default is 1)</dt>
<dd>Whether to return the top-K largest or smallest elements.</dd>
<dt><tt>sorted</tt> : int (default is 1)</dt>
<dd>Whether to return the elements in sorted order.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Tensor of shape [a_0, a_1, ..., a_{n-1}]</dd>
<dt><tt>K</tt> (non-differentiable) : tensor(int64)</dt>
<dd>A 1-D tensor containing a single positive value corresponding to the number of top elements to retrieve</dd>
</dl>

#### Outputs

<dl>
<dt><tt>Values</tt> (differentiable) : T</dt>
<dd>Tensor of shape [a_0, a_1, ..., a_{axis-1}, k, a_{axis+1}, ... a_{n-1}] containing top K values from the input tensor</dd>
<dt><tt>Indices</tt> (non-differentiable) : I</dt>
<dd>Tensor of shape [a_0, a_1, ..., a_{axis-1}, k, a_{axis+1}, ... a_{n-1}] containing the corresponding input tensor indices for the top K values.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to numeric tensors.</dd>
<dt><tt>I</tt> : tensor(int64)</dt>
<dd>Constrain index tensor to int64</dd>
</dl>


#### Examples

<details>
<summary>top_k</summary>

```python
axis = 1
largest = 1

k = 3
node = onnx.helper.make_node(
    "TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
    [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
    ],
    dtype=np.float32,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)

# print(values_ref)
# [[ 3.  2.  1.]
# [ 7.  6.  5.]
# [11. 10.  9.]]
# print(indices_ref)
# [[3 2 1]
# [3 2 1]
# [3 2 1]]

expect(
    node, inputs=[X, K], outputs=[values_ref, indices_ref], name="test_top_k"
)
```

</details>


<details>
<summary>top_k_negative_axis</summary>

```python
axis = -1
largest = 1

k = 3
node = onnx.helper.make_node(
    "TopK", inputs=["x", "k"], outputs=["values", "indices"], axis=axis
)
X = np.array(
    [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
    ],
    dtype=np.float32,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)

# print(values_ref)
# [[ 3.  2.  1.]
# [ 7.  6.  5.]
# [11. 10.  9.]]
# print(indices_ref)
# [[3 2 1]
# [3 2 1]
# [3 2 1]]

expect(
    node,
    inputs=[X, K],
    outputs=[values_ref, indices_ref],
    name="test_top_k_negative_axis",
)
```

</details>


<details>
<summary>top_k_smallest</summary>

```python
axis = 1
largest = 0
sorted = 1  # noqa: A001
k = 3

node = onnx.helper.make_node(
    "TopK",
    inputs=["x", "k"],
    outputs=["values", "indices"],
    axis=axis,
    largest=largest,
    sorted=sorted,
)

X = np.array(
    [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [11, 10, 9, 8],
    ],
    dtype=np.float32,
)
K = np.array([k], dtype=np.int64)
values_ref, indices_ref = topk_sorted_implementation(X, k, axis, largest)

# print(values_ref)
# [[ 0.  1.  2.]
# [ 4.  5.  6.]
# [ 8.  9. 10.]]
# print(indices_ref)
# [[0 1 2]
# [0 1 2]
# [3 2 1]]

expect(
    node,
    inputs=[X, K],
    outputs=[values_ref, indices_ref],
    name="test_top_k_smallest",
)
```

</details>


### <a name="Transpose"></a><a name="transpose">**Transpose**</a>

  Transpose the input tensor similar to numpy.transpose. For example, when
  perm=(1, 0, 2), given an input tensor of shape (1, 2, 3), the output shape
  will be (2, 1, 3).
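
The shape arithmetic above can be checked with `numpy.transpose`, which this description refers to. The variable names are illustrative.

```python
import numpy as np

data = np.zeros((1, 2, 3), dtype=np.float32)

permuted = np.transpose(data, (1, 0, 2))  # explicit perm
reversed_dims = np.transpose(data)        # no perm: dimensions are reversed
```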

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Transpose-1">1</a>, <a href="Changelog.md#Transpose-13">13</a>

#### Attributes

<dl>
<dt><tt>perm</tt> : list of ints</dt>
<dd>A list of integers. By default, reverse the dimensions, otherwise permute the axes according to the values given. Its length must be equal to the rank of the input.</dd>
</dl>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>An input tensor.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>transposed</tt> (differentiable) : T</dt>
<dd>Transposed output.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>all_permutations</summary>

```python
shape = (2, 3, 4)
data = np.random.random_sample(shape).astype(np.float32)
permutations = list(itertools.permutations(np.arange(len(shape))))

for i, permutation in enumerate(permutations):
    node = onnx.helper.make_node(
        "Transpose",
        inputs=["data"],
        outputs=["transposed"],
        perm=permutation,
    )
    transposed = np.transpose(data, permutation)
    expect(
        node,
        inputs=[data],
        outputs=[transposed],
        name=f"test_transpose_all_permutations_{i}",
    )
```

</details>


<details>
<summary>default</summary>

```python
shape = (2, 3, 4)
data = np.random.random_sample(shape).astype(np.float32)

node = onnx.helper.make_node(
    "Transpose", inputs=["data"], outputs=["transposed"]
)

transposed = np.transpose(data)
expect(node, inputs=[data], outputs=[transposed], name="test_transpose_default")
```

</details>


### <a name="Trilu"></a><a name="trilu">**Trilu**</a>

  Given a 2-D matrix or batches of 2-D matrices, returns the upper or lower triangular part of the tensor(s).
  The attribute "upper" determines whether the upper or lower part is retained. If set to true,
  the upper triangular matrix is retained. Lower triangular matrix is retained otherwise.
  Default value for the "upper" attribute is true.
  Trilu takes one input tensor of shape [*, N, M], where * is zero or more batch dimensions. The upper triangular part consists
  of the elements on and above the given diagonal (k). The lower triangular part consists of elements on and below the diagonal.
  All other elements in the matrix are set to zero.
  If k = 0, the triangular part on and above/below the main diagonal is retained.
  If upper is set to true, a positive k retains the upper triangular matrix excluding the main diagonal and (k-1) diagonals above it.
  A negative k value retains the main diagonal and |k| diagonals below it.
  If upper is set to false, a positive k retains the lower triangular matrix including the main diagonal and k diagonals above it.
  A negative k value excludes the main diagonal and (|k|-1) diagonals below it.
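
The effect of k can be sketched with NumPy's `triu` and `tril`, assuming their `k` offset follows the convention described above. This is an illustration only; the input values are arbitrary.

```python
import numpy as np

x = np.arange(1, 21, dtype=np.int64).reshape(4, 5)

upper_k0 = np.triu(x)         # upper=1, k=0: main diagonal and everything above it
upper_k1 = np.triu(x, k=1)    # upper=1, k=1: main diagonal excluded
upper_km1 = np.triu(x, k=-1)  # upper=1, k=-1: main diagonal plus one diagonal below
lower_k1 = np.tril(x, k=1)    # upper=0, k=1: main diagonal plus one diagonal above
lower_km1 = np.tril(x, k=-1)  # upper=0, k=-1: main diagonal excluded
```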

#### Version

This version of the operator has been available since version 14 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>upper</tt> : int (default is 1)</dt>
<dd>Boolean. Indicates whether upper or lower part of matrix is retained. Default is true.</dd>
</dl>

#### Inputs (1 - 2)

<dl>
<dt><tt>input</tt> (differentiable) : T</dt>
<dd>Input tensor of rank 2 or higher.</dd>
<dt><tt>k</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>A 0-D tensor containing a single value corresponding to the number of diagonals above or below the main diagonal to exclude or include. Default value is 0 if it's not specified.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Output tensor of the same type and shape as the input tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types.</dd>
</dl>


#### Examples

<details>
<summary>tril</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 1, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[4, 0, 0, 0, 0],
#   [1, 2, 0, 0, 0],
#   [9, 4, 1, 0, 0],
#   [4, 3, 4, 2, 0]]
y = tril_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_tril")
```

</details>


<details>
<summary>tril_neg</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 1, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[0, 0, 0, 0, 0],
#   [1, 0, 0, 0, 0],
#   [9, 4, 0, 0, 0],
#   [4, 3, 4, 0, 0]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_neg")
```

</details>


<details>
<summary>tril_one_row</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(3, 1, 5)).astype(np.int64)
# X:
# [[[6, 2, 4, 1, 6]],
#
#  [[8, 3, 8, 7, 0]],
#
#  [[2, 2, 9, 5, 9]]]
# expect result:
# [[[6, 0, 0, 0, 0]],
#
#  [[8, 0, 0, 0, 0]],
#
#  [[2, 0, 0, 0, 0]]]
y = tril_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_tril_one_row_neg")
```

</details>


<details>
<summary>tril_out_neg</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-7).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 1, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[0, 0, 0, 0, 0],
#   [0, 0, 0, 0, 0],
#   [0, 0, 0, 0, 0],
#   [0, 0, 0, 0, 0]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_out_neg")
```

</details>


<details>
<summary>tril_out_pos</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
    upper=0,
)
x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 1, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 1, 8, 7],
#   [4, 3, 4, 2, 4]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_out_pos")
```

</details>


<details>
<summary>tril_pos</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(2).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 1, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[4, 7, 3, 0, 0],
#   [1, 2, 8, 6, 0],
#   [9, 4, 1, 8, 7],
#   [4, 3, 4, 2, 4]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_pos")
```

</details>


<details>
<summary>tril_square</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
# X:
# [[[0, 4, 3],
#   [2, 0, 9],
#   [8, 2, 5]],
#
#  [[2, 7, 2],
#   [2, 6, 0],
#   [2, 6, 5]]]
# expect result:
# [[[0, 0, 0],
#   [2, 0, 0],
#   [8, 2, 5]],
#
#  [[2, 0, 0],
#   [2, 6, 0],
#   [2, 6, 5]]]
y = tril_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_tril_square")
```

</details>


<details>
<summary>tril_square_neg</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
# [[[0, 4, 3],
#   [2, 0, 9],
#   [8, 2, 5]],
#
#  [[2, 7, 2],
#   [2, 6, 0],
#   [2, 6, 5]]]
# expect result:
# [[[0, 0, 0],
#   [2, 0, 0],
#   [8, 2, 0]],
#
#  [[0, 0, 0],
#   [2, 0, 0],
#   [2, 6, 0]]]
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_square_neg")
```

</details>


<details>
<summary>tril_zero</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
    upper=0,
)

x = np.random.randint(10, size=(3, 0, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
# []
# expect result:
# []
y = tril_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_tril_zero")
```

</details>


<details>
<summary>triu</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x"],
    outputs=["y"],
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 0, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[4, 7, 3, 7, 9],
#   [0, 2, 8, 6, 9],
#   [0, 0, 0, 8, 7],
#   [0, 0, 0, 2, 4]]
y = triu_reference_implementation(x)
expect(node, inputs=[x], outputs=[y], name="test_triu")
```

</details>


<details>
<summary>triu_neg</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 0, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [0, 4, 0, 8, 7],
#   [0, 0, 4, 2, 4]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_neg")
```

</details>


<details>
<summary>triu_one_row</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
)

x = np.random.randint(10, size=(3, 1, 5)).astype(np.int64)
k = np.array(1).astype(np.int64)
# X:
# [[[1, 4, 9, 7, 1]],
#
#  [[9, 2, 8, 8, 4]],
#
#  [[3, 9, 7, 4, 2]]]
# expect result:
# [[[0, 4, 9, 7, 1]],
#
#  [[0, 2, 8, 8, 4]],
#
#  [[0, 9, 7, 4, 2]]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_one_row")
```

</details>


<details>
<summary>triu_out_neg_out</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(-7).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 0, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 0, 8, 7],
#   [4, 3, 4, 2, 4]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_out_neg_out")
```

</details>


<details>
<summary>triu_out_pos</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 0, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[0, 0, 0, 0, 0],
#   [0, 0, 0, 0, 0],
#   [0, 0, 0, 0, 0],
#   [0, 0, 0, 0, 0]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_out_pos")
```

</details>


<details>
<summary>triu_pos</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
)

x = np.random.randint(10, size=(4, 5)).astype(np.int64)
k = np.array(2).astype(np.int64)
# X:
#  [[4, 7, 3, 7, 9],
#   [1, 2, 8, 6, 9],
#   [9, 4, 0, 8, 7],
#   [4, 3, 4, 2, 4]]
# expect result:
#  [[0, 0, 3, 7, 9],
#   [0, 0, 0, 6, 9],
#   [0, 0, 0, 0, 7],
#   [0, 0, 0, 0, 0]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_pos")
```

</details>


<details>
<summary>triu_square</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x"],
    outputs=["y"],
)

x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
y = triu_reference_implementation(x)
# X:
# [[[4, 6, 9],
#   [7, 5, 4],
#   [8, 1, 2]],
#
#  [[1, 4, 9],
#   [9, 6, 3],
#   [8, 9, 8]]]
# expect result:
# [[[4, 6, 9],
#   [0, 5, 4],
#   [0, 0, 2]],
#
#  [[1, 4, 9],
#   [0, 6, 3],
#   [0, 0, 8]]]
expect(node, inputs=[x], outputs=[y], name="test_triu_square")
```

</details>


<details>
<summary>triu_square_neg</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
)

x = np.random.randint(10, size=(2, 3, 3)).astype(np.int64)
k = np.array(-1).astype(np.int64)
# X:
# [[[4, 6, 9],
#   [7, 5, 4],
#   [8, 1, 2]],
#
#  [[1, 4, 9],
#   [9, 6, 3],
#   [8, 9, 8]]]
# expect result:
# [[[4, 6, 9],
#   [7, 5, 4],
#   [0, 1, 2]],
#
#  [[1, 4, 9],
#   [9, 6, 3],
#   [0, 9, 8]]]
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_square_neg")
```

</details>


<details>
<summary>triu_zero</summary>

```python
node = onnx.helper.make_node(
    "Trilu",
    inputs=["x", "k"],
    outputs=["y"],
)

x = np.random.randint(10, size=(0, 5)).astype(np.int64)
k = np.array(6).astype(np.int64)
# X:
# []
# expect result:
# []
y = triu_reference_implementation(x, int(k))
expect(node, inputs=[x, k], outputs=[y], name="test_triu_zero")
```

</details>


### <a name="Unique"></a><a name="unique">**Unique**</a>

  Find the unique elements of a tensor. When an optional attribute 'axis' is provided, unique subtensors sliced along the 'axis' are returned.
  Otherwise the input tensor is flattened and unique values of the flattened tensor are returned.

  This operator returns the unique values or sliced unique subtensors of the input tensor and three optional outputs.
  The first output tensor 'Y' contains all unique values or subtensors of the input.
  The second optional output tensor 'indices' contains indices of 'Y' elements' first occurrence in 'X'.
  The third optional output tensor 'inverse_indices' contains, for each element of 'X', its corresponding index in 'Y'.
  The fourth optional output tensor 'counts' contains the count of each element of 'Y' in the input.

  Outputs are either sorted in ascending order or optionally in the order of the first occurrence of the values in the input.

  https://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html

  Example 1:
  ```
  input_X = [2, 1, 1, 3, 4, 3]
  attribute_sorted = 0
  attribute_axis = None
  output_Y = [2, 1, 3, 4]
  output_indices = [0, 1, 3, 4]
  output_inverse_indices = [0, 1, 1, 2, 3, 2]
  output_counts = [1, 2, 2, 1]
  ```

  Example 2:
  ```
  input_X = [[1, 3], [2, 3]]
  attribute_sorted = 1
  attribute_axis = None
  output_Y = [1, 2, 3]
  output_indices = [0, 2, 1]
  output_inverse_indices = [0, 2, 1, 2]
  output_counts = [1, 1, 2]
  ```

  Example 3:
  ```
  input_X = [[1, 0, 0], [1, 0, 0], [2, 3, 4]]
  attribute_sorted = 1
  attribute_axis = 0
  output_Y = [[1, 0, 0], [2, 3, 4]]
  output_indices = [0, 2]
  output_inverse_indices = [0, 0, 1]
  output_counts = [2, 1]
  ```

  Example 4:
  ```
  input_x = [[[1., 1.], [0., 1.], [2., 1.], [0., 1.]],
              [[1., 1.], [0., 1.], [2., 1.], [0., 1.]]]
  attribute_sorted = 1
  attribute_axis = 1
  ```

  Intermediate data are presented below for better understanding.
  There are 4 subtensors sliced along axis 1 of input_x (shape = (2, 4, 2)):
  ```
  A: [[1, 1], [1, 1]],
     [[0, 1], [0, 1]],
     [[2, 1], [2, 1]],
     [[0, 1], [0, 1]].
  ```

  there are 3 unique subtensors:
  ```
  [[1, 1], [1, 1]],
  [[0, 1], [0, 1]],
  [[2, 1], [2, 1]].
  ```

  sorted unique subtensors:
  ```
  B: [[0, 1], [0, 1]],
     [[1, 1], [1, 1]],
     [[2, 1], [2, 1]].
  ```

  output_Y is constructed from B:
  ```
  [[[0. 1.], [1. 1.], [2. 1.]],
   [[0. 1.], [1. 1.], [2. 1.]]]
  ```

  output_indices maps from B to A:
  ```
  [1, 0, 2]
  ```

  output_inverse_indices maps from A to B:
  ```
  [1, 0, 2, 0]
  ```

  output_counts:
  ```
  [2, 1, 1]
  ```
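
How the four outputs relate to each other can be checked directly. The sketch below copies the values from Example 2 (no `axis`, so the input is treated as flattened) and verifies three relationships implied by the output descriptions. It is only a consistency check written with NumPy, not an implementation of the operator.

```python
import numpy as np

# Values copied from Example 2 (sorted = 1, no axis).
x = np.array([[1, 3], [2, 3]], dtype=np.int64)
y = np.array([1, 2, 3], dtype=np.int64)
indices = np.array([0, 2, 1], dtype=np.int64)
inverse_indices = np.array([0, 2, 1, 2], dtype=np.int64)
counts = np.array([1, 1, 2], dtype=np.int64)

flat = x.reshape(-1)  # without 'axis', the operator works on the flattened input

# 'indices' selects the first occurrence of each unique value in the flattened input.
picked = flat[indices]
# 'inverse_indices' rebuilds the flattened input from 'Y'.
rebuilt = y[inverse_indices]
# 'counts' equals how often each entry of 'Y' is referenced by 'inverse_indices'.
recount = np.bincount(inverse_indices, minlength=y.size)

assert np.array_equal(picked, y)
assert np.array_equal(rebuilt, flat)
assert np.array_equal(recount, counts)
```

The same three checks apply when `sorted = 0`; only the order of the entries in 'Y' changes.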

#### Version

This version of the operator has been available since version 11 of the default ONNX operator set.

#### Attributes

<dl>
<dt><tt>axis</tt> : int</dt>
<dd>(Optional) The dimension to apply unique. If not specified, the unique elements of the flattened input are returned. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).</dd>
<dt><tt>sorted</tt> : int (default is 1)</dt>
<dd>(Optional) Whether to sort the unique elements in ascending order before returning as output. Must be one of 0, or 1 (default).</dd>
</dl>

#### Inputs

<dl>
<dt><tt>X</tt> (non-differentiable) : T</dt>
<dd>An N-D input tensor that is to be processed.</dd>
</dl>

#### Outputs (1 - 4)

<dl>
<dt><tt>Y</tt> (non-differentiable) : T</dt>
<dd>A tensor of the same type as 'X' containing all the unique values or subtensors sliced along a provided 'axis' in 'X', either sorted or maintained in the same order they occur in input 'X'.</dd>
<dt><tt>indices</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing indices of 'Y' elements' first occurrence in 'X'. When 'axis' is provided, it contains indices to subtensors in input 'X' on the 'axis'. When 'axis' is not provided, it contains indices to values in the flattened input tensor.</dd>
<dt><tt>inverse_indices</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing, for each element of 'X', its corresponding index in 'Y'. When 'axis' is provided, it contains indices to subtensors in output 'Y' on the 'axis'. When 'axis' is not provided, it contains indices to values in output 'Y'.</dd>
<dt><tt>counts</tt> (optional, non-differentiable) : tensor(int64)</dt>
<dd>A 1-D INT64 tensor containing the count of each element of 'Y' in input 'X'.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Input can be of any tensor type.</dd>
</dl>


#### Examples

<details>
<summary>not_sorted_without_axis</summary>

```python
node_not_sorted = onnx.helper.make_node(
    "Unique",
    inputs=["X"],
    outputs=["Y", "indices", "inverse_indices", "counts"],
    sorted=0,
)
# numpy unique does not retain original order (it sorts the output unique values)
# https://github.com/numpy/numpy/issues/8621
# we need to recover unsorted output and indices
x = np.array([2.0, 1.0, 1.0, 3.0, 4.0, 3.0], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True)

# prepare index mapping from sorted to unsorted
argsorted_indices = np.argsort(indices)
inverse_indices_map = dict(
    zip(argsorted_indices, np.arange(len(argsorted_indices)))
)

indices = indices[argsorted_indices]
y = np.take(x, indices, axis=0)
inverse_indices = np.asarray(
    [inverse_indices_map[i] for i in inverse_indices], dtype=np.int64
)
counts = counts[argsorted_indices]
indices, inverse_indices, counts = specify_int64(
    indices, inverse_indices, counts
)
# print(y)
# [2.0, 1.0, 3.0, 4.0]
# print(indices)
# [0 1 3 4]
# print(inverse_indices)
# [0, 1, 1, 2, 3, 2]
# print(counts)
# [1, 2, 2, 1]

expect(
    node_not_sorted,
    inputs=[x],
    outputs=[y, indices, inverse_indices, counts],
    name="test_unique_not_sorted_without_axis",
)
```

</details>


<details>
<summary>sorted_with_axis</summary>

```python
node_sorted = onnx.helper.make_node(
    "Unique",
    inputs=["X"],
    outputs=["Y", "indices", "inverse_indices", "counts"],
    sorted=1,
    axis=0,
)

x = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=0)
indices, inverse_indices, counts = specify_int64(
    indices, inverse_indices, counts
)
# print(y)
# [[1. 0. 0.]
#  [2. 3. 4.]]
# print(indices)
# [0 2]
# print(inverse_indices)
# [0 0 1]
# print(counts)
# [2 1]

expect(
    node_sorted,
    inputs=[x],
    outputs=[y, indices, inverse_indices, counts],
    name="test_unique_sorted_with_axis",
)
```

</details>


<details>
<summary>sorted_with_axis_3d</summary>

```python
node_sorted = onnx.helper.make_node(
    "Unique",
    inputs=["X"],
    outputs=["Y", "indices", "inverse_indices", "counts"],
    sorted=1,
    axis=1,
)

x = np.array(
    [
        [[1.0, 1.0], [0.0, 1.0], [2.0, 1.0], [0.0, 1.0]],
        [[1.0, 1.0], [0.0, 1.0], [2.0, 1.0], [0.0, 1.0]],
    ],
    dtype=np.float32,
)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=1)
indices, inverse_indices, counts = specify_int64(
    indices, inverse_indices, counts
)
# print(y)
# [[[0. 1.]
#  [1. 1.]
#  [2. 1.]]
# [[0. 1.]
#  [1. 1.]
#  [2. 1.]]]
# print(indices)
# [1 0 2]
# print(inverse_indices)
# [1 0 2 0]
# print(counts)
# [2 1 1]
expect(
    node_sorted,
    inputs=[x],
    outputs=[y, indices, inverse_indices, counts],
    name="test_unique_sorted_with_axis_3d",
)
```

</details>


<details>
<summary>sorted_with_negative_axis</summary>

```python
node_sorted = onnx.helper.make_node(
    "Unique",
    inputs=["X"],
    outputs=["Y", "indices", "inverse_indices", "counts"],
    sorted=1,
    axis=-1,
)

x = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 3]], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True, axis=-1)
indices, inverse_indices, counts = specify_int64(
    indices, inverse_indices, counts
)
# print(y)
# [[0. 1.]
#  [0. 1.]
#  [3. 2.]]
# print(indices)
# [1 0]
# print(inverse_indices)
# [1 0 0]
# print(counts)
# [2 1]

expect(
    node_sorted,
    inputs=[x],
    outputs=[y, indices, inverse_indices, counts],
    name="test_unique_sorted_with_negative_axis",
)
```

</details>


<details>
<summary>sorted_without_axis</summary>

```python
node_sorted = onnx.helper.make_node(
    "Unique",
    inputs=["X"],
    outputs=["Y", "indices", "inverse_indices", "counts"],
)

x = np.array([2.0, 1.0, 1.0, 3.0, 4.0, 3.0], dtype=np.float32)
y, indices, inverse_indices, counts = np.unique(x, True, True, True)
indices, inverse_indices, counts = specify_int64(
    indices, inverse_indices, counts
)
expect(
    node_sorted,
    inputs=[x],
    outputs=[y, indices, inverse_indices, counts],
    name="test_unique_sorted_without_axis",
)
```

</details>


### <a name="Unsqueeze"></a><a name="unsqueeze">**Unsqueeze**</a>

  Insert single-dimensional entries into the shape of an input tensor (`data`).
  Takes one required input `axes`, which contains a list of dimension indices; the operator inserts a dimension of size `1` at each of those indices in the shape of the output tensor (`expanded`).

  For example, given an input tensor (`data`) of shape [3, 4, 5],
  Unsqueeze(data, axes=[0, 4]) outputs a tensor (`expanded`) containing the same data as `data` but with shape [1, 3, 4, 5, 1].

  The input `axes` must not contain duplicate entries; duplicates are an error.
  The rank of the output tensor (`output_rank`) is the rank of the input tensor (`data`) plus the number of values in `axes`.
  Each value in `axes` should be within the (inclusive) range [-output_rank , output_rank - 1].
  The order of values in `axes` does not matter.
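
The shape rules above can be sketched in plain Python. `unsqueeze_shape` is a hypothetical helper that only does the shape arithmetic. Treating two axes that normalize to the same position (for example `0` and `-output_rank`) as duplicates is an assumption of this sketch.

```python
def unsqueeze_shape(input_shape, axes):
    """Return the output shape Unsqueeze would produce (shape arithmetic only)."""
    output_rank = len(input_shape) + len(axes)
    # Normalize negative axes against the OUTPUT rank, as the text above specifies.
    normalized = []
    for axis in axes:
        if not -output_rank <= axis <= output_rank - 1:
            raise ValueError(f"axis {axis} is outside [-{output_rank}, {output_rank - 1}]")
        normalized.append(axis % output_rank)
    if len(set(normalized)) != len(normalized):
        raise ValueError("axes must not contain duplicates")
    # Place a 1 at every requested position; fill the rest from the input shape in order.
    remaining = iter(input_shape)
    return [1 if i in normalized else next(remaining) for i in range(output_rank)]


shape_a = unsqueeze_shape([3, 4, 5], [0, 4])      # the example from the text
shape_b = unsqueeze_shape([3, 4, 5], [5, 4, 2])   # unsorted axes
shape_c = unsqueeze_shape([1, 3, 1, 5], [-2])     # negative axis
```

Because the positions are resolved against the output rank, the result does not depend on the order in which `axes` lists them.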

#### Version

This version of the operator has been available since version 21 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Unsqueeze-1">1</a>, <a href="Changelog.md#Unsqueeze-11">11</a>, <a href="Changelog.md#Unsqueeze-13">13</a>

#### Inputs

<dl>
<dt><tt>data</tt> (differentiable) : T</dt>
<dd>Original tensor</dd>
<dt><tt>axes</tt> (non-differentiable) : tensor(int64)</dt>
<dd>List of integers indicating the dimensions to be inserted. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(expanded).</dd>
</dl>

#### Outputs

<dl>
<dt><tt>expanded</tt> (differentiable) : T</dt>
<dd>Reshaped tensor with same data as input.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128), tensor(float8e4m3fn), tensor(float8e4m3fnuz), tensor(float8e5m2), tensor(float8e5m2fnuz), tensor(uint4), tensor(int4)</dt>
<dd>Constrain input and output types to all tensor types up to IRv10.</dd>
</dl>


#### Examples

<details>
<summary>unsqueeze_negative_axes</summary>

```python
node = onnx.helper.make_node(
    "Unsqueeze",
    inputs=["x", "axes"],
    outputs=["y"],
)
x = np.random.randn(1, 3, 1, 5).astype(np.float32)
axes = np.array([-2]).astype(np.int64)
y = np.expand_dims(x, axis=-2)
expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_negative_axes")
```

</details>


<details>
<summary>unsqueeze_one_axis</summary>

```python
x = np.random.randn(3, 4, 5).astype(np.float32)

for i in range(x.ndim):
    axes = np.array([i]).astype(np.int64)
    node = onnx.helper.make_node(
        "Unsqueeze",
        inputs=["x", "axes"],
        outputs=["y"],
    )
    y = np.expand_dims(x, axis=i)

    expect(
        node,
        inputs=[x, axes],
        outputs=[y],
        name="test_unsqueeze_axis_" + str(i),
    )
```

</details>


<details>
<summary>unsqueeze_three_axes</summary>

```python
x = np.random.randn(3, 4, 5).astype(np.float32)
axes = np.array([2, 4, 5]).astype(np.int64)

node = onnx.helper.make_node(
    "Unsqueeze",
    inputs=["x", "axes"],
    outputs=["y"],
)
y = np.expand_dims(x, axis=2)
y = np.expand_dims(y, axis=4)
y = np.expand_dims(y, axis=5)

expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_three_axes")
```

</details>


<details>
<summary>unsqueeze_two_axes</summary>

```python
x = np.random.randn(3, 4, 5).astype(np.float32)
axes = np.array([1, 4]).astype(np.int64)

node = onnx.helper.make_node(
    "Unsqueeze",
    inputs=["x", "axes"],
    outputs=["y"],
)
y = np.expand_dims(x, axis=1)
y = np.expand_dims(y, axis=4)

expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_two_axes")
```

</details>


<details>
<summary>unsqueeze_unsorted_axes</summary>

```python
x = np.random.randn(3, 4, 5).astype(np.float32)
axes = np.array([5, 4, 2]).astype(np.int64)

node = onnx.helper.make_node(
    "Unsqueeze",
    inputs=["x", "axes"],
    outputs=["y"],
)
y = np.expand_dims(x, axis=2)
y = np.expand_dims(y, axis=4)
y = np.expand_dims(y, axis=5)

expect(node, inputs=[x, axes], outputs=[y], name="test_unsqueeze_unsorted_axes")
```

</details>


### <a name="Upsample"></a><a name="upsample">**Upsample** (deprecated)</a>

  Upsample the input tensor.
  Each dimension value of the output tensor is:
    output_dimension = floor(input_dimension * scale).
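
As a worked instance of the formula, the sketch below computes an output shape from an input shape and per-dimension scales. `upsample_output_shape` is a hypothetical helper name used only for illustration.

```python
import math


def upsample_output_shape(input_shape, scales):
    """Apply output_dimension = floor(input_dimension * scale) per dimension."""
    return [math.floor(dim * scale) for dim, scale in zip(input_shape, scales)]


# A [1, 1, 2, 2] input with scales [1.0, 1.0, 2.0, 3.0].
out_shape = upsample_output_shape([1, 1, 2, 2], [1.0, 1.0, 2.0, 3.0])
```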

#### Version

This version of the operator has been deprecated since version 10 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Upsample-7">7</a>, <a href="Changelog.md#Upsample-9">9</a>


#### Examples

<details>
<summary>nearest</summary>

```python
node = onnx.helper.make_node(
    "Upsample",
    inputs=["X", "scales"],
    outputs=["Y"],
    mode="nearest",
)

data = np.array(
    [
        [
            [
                [1, 2],
                [3, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

scales = np.array([1.0, 1.0, 2.0, 3.0], dtype=np.float32)

output = np.array(
    [
        [
            [
                [1, 1, 1, 2, 2, 2],
                [1, 1, 1, 2, 2, 2],
                [3, 3, 3, 4, 4, 4],
                [3, 3, 3, 4, 4, 4],
            ]
        ]
    ],
    dtype=np.float32,
)

expect(
    node,
    inputs=[data, scales],
    outputs=[output],
    name="test_upsample_nearest",
    opset_imports=[helper.make_opsetid("", 9)],
)
```

</details>


### <a name="Where"></a><a name="where">**Where**</a>

  Return elements, either from X or Y, depending on condition.
  Where behaves like
  [numpy.where](https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html)
  with three parameters.

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).
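
A small NumPy sketch of the broadcasting behaviour, assuming `numpy.where` as the reference semantics (as the description states). The three inputs have different shapes and are broadcast to a common shape of (2, 3).

```python
import numpy as np

condition = np.array([[True], [False]])                      # shape (2, 1)
x = np.array([1, 2, 3], dtype=np.int64)                      # shape (3,)
y = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.int64)   # shape (2, 3)

# Row 0 has condition True, so it takes values from x;
# row 1 has condition False, so it takes values from y.
z = np.where(condition, x, y)
```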

#### Version

This version of the operator has been available since version 16 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Where-9">9</a>

#### Inputs

<dl>
<dt><tt>condition</tt> (non-differentiable) : B</dt>
<dd>When True (nonzero), yield X, otherwise yield Y.</dd>
<dt><tt>X</tt> (differentiable) : T</dt>
<dd>Values selected at indices where condition is True.</dd>
<dt><tt>Y</tt> (differentiable) : T</dt>
<dd>Values selected at indices where condition is False.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>output</tt> (differentiable) : T</dt>
<dd>Tensor of shape equal to the broadcasted shape of condition, X, and Y.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>B</tt> : tensor(bool)</dt>
<dd>Constrain to boolean tensors.</dd>
<dt><tt>T</tt> : tensor(uint8), tensor(uint16), tensor(uint32), tensor(uint64), tensor(int8), tensor(int16), tensor(int32), tensor(int64), tensor(bfloat16), tensor(float16), tensor(float), tensor(double), tensor(string), tensor(bool), tensor(complex64), tensor(complex128)</dt>
<dd>Constrain input and output types to all tensor types (including bfloat).</dd>
</dl>


#### Examples

<details>
<summary>long</summary>

```python
node = onnx.helper.make_node(
    "Where",
    inputs=["condition", "x", "y"],
    outputs=["z"],
)

condition = np.array([[1, 0], [1, 1]], dtype=bool)
x = np.array([[1, 2], [3, 4]], dtype=np.int64)
y = np.array([[9, 8], [7, 6]], dtype=np.int64)
z = np.where(condition, x, y)  # expected output [[1, 8], [3, 4]]
expect(
    node, inputs=[condition, x, y], outputs=[z], name="test_where_long_example"
)
```

</details>


<details>
<summary>where</summary>

```python
node = onnx.helper.make_node(
    "Where",
    inputs=["condition", "x", "y"],
    outputs=["z"],
)

condition = np.array([[1, 0], [1, 1]], dtype=bool)
x = np.array([[1, 2], [3, 4]], dtype=np.float32)
y = np.array([[9, 8], [7, 6]], dtype=np.float32)
z = np.where(condition, x, y)  # expected output [[1, 8], [3, 4]]
expect(node, inputs=[condition, x, y], outputs=[z], name="test_where_example")
```

</details>


### <a name="Xor"></a><a name="xor">**Xor**</a>

  Returns the tensor resulting from performing the `xor` logical operation
  elementwise on the input tensors `A` and `B` (with Numpy-style broadcasting support).

  This operator supports **multidirectional (i.e., Numpy-style) broadcasting**; for more details please check [the doc](Broadcasting.md).

#### Version

This version of the operator has been available since version 7 of the default ONNX operator set.

Other versions of this operator: <a href="Changelog.md#Xor-1">1</a>

#### Inputs

<dl>
<dt><tt>A</tt> (non-differentiable) : T</dt>
<dd>First input operand for the logical operator.</dd>
<dt><tt>B</tt> (non-differentiable) : T</dt>
<dd>Second input operand for the logical operator.</dd>
</dl>

#### Outputs

<dl>
<dt><tt>C</tt> (non-differentiable) : T1</dt>
<dd>Result tensor.</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T</tt> : tensor(bool)</dt>
<dd>Constrain input to boolean tensor.</dd>
<dt><tt>T1</tt> : tensor(bool)</dt>
<dd>Constrain output to boolean tensor.</dd>
</dl>


#### Examples

<details>
<summary>xor</summary>

```python
node = onnx.helper.make_node(
    "Xor",
    inputs=["x", "y"],
    outputs=["xor"],
)

# 2d
x = (np.random.randn(3, 4) > 0).astype(bool)
y = (np.random.randn(3, 4) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor2d")

# 3d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(3, 4, 5) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor3d")

# 4d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor4d")
```

</details>


<details>
<summary>xor_broadcast</summary>

```python
node = onnx.helper.make_node(
    "Xor",
    inputs=["x", "y"],
    outputs=["xor"],
)

# 3d vs 1d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(5) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast3v1d")

# 3d vs 2d
x = (np.random.randn(3, 4, 5) > 0).astype(bool)
y = (np.random.randn(4, 5) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast3v2d")

# 4d vs 2d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast4v2d")

# 4d vs 3d
x = (np.random.randn(3, 4, 5, 6) > 0).astype(bool)
y = (np.random.randn(4, 5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast4v3d")

# 4d vs 4d
x = (np.random.randn(1, 4, 1, 6) > 0).astype(bool)
y = (np.random.randn(3, 1, 5, 6) > 0).astype(bool)
z = np.logical_xor(x, y)
expect(node, inputs=[x, y], outputs=[z], name="test_xor_bcast4v4d")
```

</details>


## ai.onnx.preview.training
### <a name="ai.onnx.preview.training.Adagrad"></a><a name="ai.onnx.preview.training.adagrad">**ai.onnx.preview.training.Adagrad**</a>

  Compute one iteration of ADAGRAD, a stochastic gradient based optimization
      algorithm. This operator can conduct the optimization of multiple tensor variables.

      Let's define the behavior of this operator. As you can imagine, ADAGRAD requires
      some parameters:

       - The initial learning-rate "R".
       - The update count "T". That is, the number of training iterations conducted.
       - An L2-norm regularization coefficient "norm_coefficient".
       - A learning-rate decay factor "decay_factor".
       - A small constant "epsilon" to avoid division by zero.

      At each ADAGRAD iteration, the optimized tensors are moved along a direction
      computed based on their estimated gradient and accumulated squared gradient. Assume
      that only a single tensor "X" is updated by this operator. We need the value of "X",
      its gradient "G", and its accumulated squared gradient "H". Therefore, variables in
      this operator's input list are sequentially "R", "T", "X", "G", and "H". Other
      parameters are given as attributes because they are usually constants. Also, the
      corresponding output tensors are the new value of "X" (called "X_new"), and then
      the new accumulated squared gradient (called "H_new"). Those outputs are computed
      from the given inputs following the pseudo code below.

      Let "+", "-", "*", and "/" be element-wise arithmetic operations with
      numpy-style broadcasting support. The pseudo code to compute those outputs is:

        // Compute a scalar learning-rate factor. At the first update of X, T is generally
        // 0 (0-based update index) or 1 (1-based update index).
        r = R / (1 + T * decay_factor);

        // Add gradient of 0.5 * norm_coefficient * ||X||_2^2, where ||X||_2 is the 2-norm.
        G_regularized = norm_coefficient * X + G;

        // Compute new accumulated squared gradient.
        H_new = H + G_regularized * G_regularized;

        // Compute the adaptive part of per-coordinate learning rate. Note that Sqrt(...)
        // computes element-wise square-root.
        H_adaptive = Sqrt(H_new) + epsilon;

        // Compute the new value of "X".
        X_new = X - r * G_regularized / H_adaptive;

      If this operator is assigned to optimize multiple inputs, for example, "X_1" and "X_2", the same
      pseudo code may be extended to handle all tensors jointly. More specifically, we can view "X" as a
      concatenation of "X_1" and "X_2" (their gradients and accumulated squared gradients must
      be concatenated in the same way) and then reuse the entire pseudo code.

      Note that ADAGRAD was first proposed in http://jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.
      In terms of that reference paper, this operator is a special case of the composite mirror
      descent update shown in its Figure 1.
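
The pseudo code above maps almost line for line onto NumPy. The following is a minimal sketch for a single tensor, not necessarily the reference implementation used by the ONNX test suite. The function name `adagrad_step` and its keyword defaults are assumptions chosen here for illustration; the defaults mirror the attribute defaults listed below.

```python
import numpy as np


def adagrad_step(r, t, x, g, h, norm_coefficient=0.0, epsilon=0.0, decay_factor=0.0):
    """Hypothetical helper: one ADAGRAD update following the pseudo code."""
    # Scalar learning-rate factor; t is the update count.
    r_eff = r / (1 + t * decay_factor)
    # Add the gradient of 0.5 * norm_coefficient * ||x||_2^2.
    g_regularized = norm_coefficient * x + g
    # New accumulated squared gradient (element-wise).
    h_new = h + g_regularized * g_regularized
    # Adaptive part of the per-coordinate learning rate.
    h_adaptive = np.sqrt(h_new) + epsilon
    # New value of the optimized tensor.
    x_new = x - r_eff * g_regularized / h_adaptive
    return x_new, h_new


x_new, h_new = adagrad_step(
    r=np.float32(0.1),
    t=np.int64(0),
    x=np.array([1.0], dtype=np.float32),
    g=np.array([-1.0], dtype=np.float32),
    h=np.array([2.0], dtype=np.float32),
    norm_coefficient=0.001,
    epsilon=1e-5,
    decay_factor=0.1,
)
print(x_new, h_new)  # roughly [1.0577] and [2.998]
```

With these inputs the regularized gradient is 0.001 * 1.0 - 1.0 = -0.999, so the accumulator grows by 0.999^2 and "X" moves in the positive direction.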

#### Version

This version of the operator has been available since version 1 of the 'ai.onnx.preview.training' operator set.

#### Attributes

<dl>
<dt><tt>decay_factor</tt> : float (default is 0.0)</dt>
<dd>The decay factor of the learning rate after one update. The effective learning rate is computed by r = R / (1 + T * decay_factor). Defaults to 0 so that increasing update counts don't reduce the learning rate.</dd>
<dt><tt>epsilon</tt> : float (default is 0.0)</dt>
<dd>Small scalar to avoid dividing by zero.</dd>
<dt><tt>norm_coefficient</tt> : float (default is 0.0)</dt>
<dd>Regularization coefficient in 0.5 * norm_coefficient * ||X||_2^2. Defaults to 0, which means no regularization.</dd>
</dl>
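
To make the role of `decay_factor` concrete, the short sketch below evaluates r = R / (1 + T * decay_factor) at a few update counts. The function name `effective_learning_rate` and the sample values are illustrative assumptions, not part of the operator definition.

```python
def effective_learning_rate(initial_rate, update_count, decay_factor=0.0):
    """Hypothetical helper: the scalar learning-rate factor r."""
    return initial_rate / (1 + update_count * decay_factor)


# With decay_factor = 0.1, the rate halves after 10 updates and
# drops to about one tenth of its initial value after 90 updates.
rates = [effective_learning_rate(0.1, t, decay_factor=0.1) for t in (0, 10, 90)]

# With the default decay_factor = 0.0, the rate never changes.
constant = [effective_learning_rate(0.1, t) for t in (0, 10, 90)]
print(rates, constant)
```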

#### Inputs (3 - &#8734;)

<dl>
<dt><tt>R</tt> : T1</dt>
<dd>The initial learning rate.</dd>
<dt><tt>T</tt> : T2</dt>
<dd>The update count of "X". It should be a scalar.</dd>
<dt><tt>inputs</tt> (variadic, heterogeneous) : T3</dt>
<dd>The current values of optimized tensors, followed by their respective gradients, followed by their respective accumulated squared gradients. For example, if two tensors "X_1" and "X_2" are optimized, the input list would be ["X_1", "X_2", gradient of "X_1", gradient of "X_2", accumulated squared gradient of "X_1", accumulated squared gradient of "X_2"].</dd>
</dl>
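
Because the variadic `inputs` list groups all values first, then all gradients, then all accumulators, a consumer has to regroup it per tensor. One possible way to do that is sketched below; `split_adagrad_inputs` is a hypothetical helper name, and the sketch assumes the list length is a multiple of three.

```python
import numpy as np


def split_adagrad_inputs(inputs):
    """Hypothetical helper: regroup [X..., G..., H...] into (X, G, H) triples."""
    if len(inputs) % 3 != 0:
        raise ValueError("expected equally many values, gradients, and accumulators")
    n = len(inputs) // 3
    values = inputs[:n]
    gradients = inputs[n:2 * n]
    accumulators = inputs[2 * n:]
    return list(zip(values, gradients, accumulators))


x1 = np.array([1.0], dtype=np.float32)
g1 = np.array([-1.0], dtype=np.float32)
h1 = np.array([2.0], dtype=np.float32)
x2 = np.array([1.0, 2.0], dtype=np.float32)
g2 = np.array([-1.0, -3.0], dtype=np.float32)
h2 = np.array([4.0, 1.0], dtype=np.float32)

# Same ordering as the variadic part of the operator's input list.
triples = split_adagrad_inputs([x1, x2, g1, g2, h1, h2])
for value, gradient, accumulator in triples:
    print(value.shape, gradient.shape, accumulator.shape)
```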

#### Outputs (1 - &#8734;)

<dl>
<dt><tt>outputs</tt> (variadic, heterogeneous) : T3</dt>
<dd>Updated values of optimized tensors, followed by their updated accumulated squared gradients. For example, if two tensors "X_1" and "X_2" are optimized, the output list would be [new value of "X_1", new value of "X_2", new accumulated squared gradient of "X_1", new accumulated squared gradient of "X_2"].</dd>
</dl>

#### Type Constraints

<dl>
<dt><tt>T1</tt> : tensor(float), tensor(double)</dt>
<dd>Constrain input types to float scalars.</dd>
<dt><tt>T2</tt> : tensor(int64)</dt>
<dd>Constrain input types to 64-bit integer scalars.</dd>
<dt><tt>T3</tt> : tensor(float), tensor(double)</dt>
<dd>Constrain input and output types to float tensors.</dd>
</dl>


#### Examples

<details>
<summary>adagrad</summary>

```python
# Define operator attributes.
norm_coefficient = 0.001
epsilon = 1e-5
decay_factor = 0.1

# Create operator.
node = onnx.helper.make_node(
    "Adagrad",
    inputs=["R", "T", "X", "G", "H"],
    outputs=["X_new", "H_new"],
    norm_coefficient=norm_coefficient,
    epsilon=epsilon,
    decay_factor=decay_factor,
    domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)

# Define operator inputs.
r = np.array(0.1, dtype=np.float32)  # scalar
t = np.array(0, dtype=np.int64)  # scalar
x = np.array([1.0], dtype=np.float32)
g = np.array([-1.0], dtype=np.float32)
h = np.array([2.0], dtype=np.float32)

# Compute expected outputs of Adagrad.
x_new, h_new = apply_adagrad(
    r, t, x, g, h, norm_coefficient, epsilon, decay_factor
)

# Check results.
expect(
    node,
    inputs=[r, t, x, g, h],
    outputs=[x_new, h_new],
    name="test_adagrad",
    opset_imports=[
        onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
    ],
)
```

</details>


<details>
<summary>adagrad_multiple</summary>

```python
# Define operator attributes.
norm_coefficient = 0.001
epsilon = 1e-5
decay_factor = 0.1

node = onnx.helper.make_node(
    "Adagrad",
    inputs=["R", "T", "X1", "X2", "G1", "G2", "H1", "H2"],
    outputs=["X1_new", "X2_new", "H1_new", "H2_new"],
    norm_coefficient=norm_coefficient,
    epsilon=epsilon,
    decay_factor=decay_factor,
    domain=AI_ONNX_PREVIEW_TRAINING_DOMAIN,
)

# Define operator inputs.
r = np.array(0.1, dtype=np.float32)  # scalar
t = np.array(0, dtype=np.int64)  # scalar

x1 = np.array([1.0], dtype=np.float32)
g1 = np.array([-1.0], dtype=np.float32)
h1 = np.array([2.0], dtype=np.float32)

x2 = np.array([1.0, 2.0], dtype=np.float32)
g2 = np.array([-1.0, -3.0], dtype=np.float32)
h2 = np.array([4.0, 1.0], dtype=np.float32)

# Compute expected outputs of Adagrad.
x1_new, h1_new = apply_adagrad(
    r, t, x1, g1, h1, norm_coefficient, epsilon, decay_factor
)
x2_new, h2_new = apply_adagrad(
    r, t, x2, g2, h2, norm_coefficient, epsilon, decay_factor
)

# Check results.
expect(
    node,
    inputs=[r, t, x1, x2, g1, g2, h1, h2],
    outputs=[x1_new, x2_new, h1_new, h2_new],
    name="test_adagrad_multiple",
    opset_imports=[
        onnx.helper.make_opsetid(AI_ONNX_PREVIEW_TRAINING_DOMAIN, 1)
    ],
)
```

</details>
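
The multi-tensor example updates each tensor separately, while the operator description says the tensors can equally be viewed as one concatenated "X". Since every step of the update is element-wise, the two views should agree. The sketch below checks this numerically; `adagrad_update` is a hypothetical stand-in written from the pseudo code, not the test suite's own helper.

```python
import numpy as np


def adagrad_update(r, t, x, g, h, norm_coefficient, epsilon, decay_factor):
    """Hypothetical helper mirroring the pseudo code in the description."""
    r_eff = r / (1 + t * decay_factor)
    g_regularized = norm_coefficient * x + g
    h_new = h + g_regularized * g_regularized
    x_new = x - r_eff * g_regularized / (np.sqrt(h_new) + epsilon)
    return x_new, h_new


attrs = dict(norm_coefficient=0.001, epsilon=1e-5, decay_factor=0.1)
r, t = 0.1, 0

x1, g1, h1 = np.array([1.0]), np.array([-1.0]), np.array([2.0])
x2, g2, h2 = np.array([1.0, 2.0]), np.array([-1.0, -3.0]), np.array([4.0, 1.0])

# Per-tensor updates, as in the example.
x1_new, h1_new = adagrad_update(r, t, x1, g1, h1, **attrs)
x2_new, h2_new = adagrad_update(r, t, x2, g2, h2, **attrs)

# Joint update on the concatenated tensors.
x_cat, h_cat = adagrad_update(
    r,
    t,
    np.concatenate([x1, x2]),
    np.concatenate([g1, g2]),
    np.concatenate([h1, h2]),
    **attrs,
)

same = np.allclose(x_cat, np.concatenate([x1_new, x2_new])) and np.allclose(
    h_cat, np.concatenate([h1_new, h2_new])
)
print(same)
```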


### <a name="ai.onnx.preview.training.Adam"></a><a name="ai.onnx.preview.training.adam">**ai.onnx.preview.training.Adam**</a>

  Compute one iteration of Adam, a stochastic gradient based optimization
      algorithm. This operator can conduct the optimization of multiple tensor variables.

      Let's define the behavior of this operator. First of all, Adam requires
      some parameters:

       - The learning-rate "R".
       - The update count "T". That is, the number of training iterations conducted.
       - An L2-norm regularization coefficient "norm_coefficient".
       - A small constant "epsilon" to avoid division by zero.
       - Two coefficients, "alpha" and "beta".

      At each Adam iteration, the optimized tensors are moved along a direction
      computed based on their exponentially-averaged historical gradient and
      exponentially-averaged historical squared gradient. Assume that only a single tensor
      "X" is being optimized. The rest of the required information is

       - the value of "X",
       - "X"'s gradient (denoted by "G"),
       - "X"'s exponentially-averaged historical gradient (denoted by "V"), and
       - "X"'s exponentially-averaged historical squared gradient (denoted by "H").

      Some of those parameters are passed into this operator as input tensors and others
      are stored as this operator's attributes. Specifically, this operator's input tensor
      list is ["R", "T", "X", "G", "V", "H"]. That is, 