{"title": "AutoGraph: Imperative-style Coding with Graph-based Performance", "book": "Proceedings of Machine Learning and Systems", "page_first": 389, "page_last": 405, "abstract": "There is a perceived trade-off between machine learning code that is easy to write, and machine learning code that is scalable or fast to execute. In machine learning, imperative style libraries like Autograd and PyTorch are easy to write, but suffer from high interpretive overhead and are not easily deployable in production or mobile settings. Graph-based libraries like TensorFlow and Theano benefit from whole-program optimization and can be deployed broadly, but make expressing complex models more cumbersome. We describe how the use of staged programming in Python, via source code transformation, offers a midpoint between these two library design patterns, capturing the benefits of both. A key insight is to delay all type-dependent decisions until runtime, via dynamic dispatch. We instantiate these principles in AutoGraph, a software system that improves the programming experience of the TensorFlow library, and demonstrate usability improvements with no loss in performance compared to native TensorFlow graphs. We also show that our system is backend agnostic, and demonstrate targeting an alternate IR with characteristics not found in TensorFlow graphs.\n", "full_text": "AUTOGRAPH: IMPERATIVE-STYLE CODING WITH GRAPH-BASED PERFORMANCE\r\n\r\nDan Moldovan 1, James M Decker 2, Fei Wang 2, Andrew A Johnson 1, Brian K Lee 1, Zachary Nado 1, D Sculley 1, Tiark Rompf 2, Alexander B Wiltschko 1\r\n\r\n1 Google Brain, Cambridge, MA. 2 Purdue University. Correspondence to: Tiark Rompf <tiark@purdue.edu>, Alexander B Wiltschko <alexbw@google.com>.\r\n\r\nABSTRACT\r\n\r\nThere is a perceived trade-off between machine learning code that is easy to write, and machine learning code that is scalable or fast to execute. In machine learning, imperative style libraries like Autograd and PyTorch are easy to write, but suffer from high interpretive overhead and are not easily deployable in production or mobile settings. Graph-based libraries like TensorFlow and Theano benefit from whole-program optimization and can be deployed broadly, but make expressing complex models more cumbersome. We describe how the use of staged programming in Python, via source code transformation, offers a midpoint between these two library design patterns, capturing the benefits of both. A key insight is to delay all type-dependent decisions until runtime, similar to dynamic dispatch. We instantiate these principles in AutoGraph, a software system that improves the programming experience of the TensorFlow library, and demonstrate usability improvements with no loss in performance compared to native TensorFlow graphs. We also show that our system is backend agnostic, targeting an alternate IR with characteristics not found in TensorFlow graphs.\r\n\r\n1   PROGRAMMING PARADIGMS FOR MACHINE LEARNING\r\n\r\nProgramming platforms specialized for machine learning (ML) are undergoing widespread adoption, as ML models such as neural networks demonstrate state-of-the-art performance on many important industrial problems like translation and image recognition. In order to support this proliferation of use, there has been rapid development of platforms for building new ML models. These platforms follow two main paradigms, graph-based programming and imperative programming. These have also been labeled Define-and-run and Define-by-run (Tokui et al., 2015b).\r\n\r\nGraph-based systems like TensorFlow and Theano use a high-level language (typically Python) to metaprogram a lower-level intermediate representation (IR) of computation (Abadi et al., 2016; Al-Rfou et al., 2016). In TensorFlow\u2019s case, this IR provides a representation that can then be automatically distributed across a datacenter, executed on accelerator hardware like GPUs or TPUs, deployed to mobile devices or web servers, and can benefit from whole-program optimization. The computational gains are significant, but come at the cost of additional cognitive load for developers.\r\n\r\nImperative programming systems like PyTorch and Autograd (Paszke et al., 2017; Maclaurin et al., 2015) run user code directly, building up a representation of the user\u2019s program incrementally for either automatic differentiation or compilation. TensorFlow also supports imperative-style coding via \u201ceager execution\u201d, where user-written Python code immediately executes TensorFlow kernels, without a graph being built. Such systems allow the user to enjoy the benefit of traditional imperative coding, but have reduced opportunities for program optimization, scalable computation and portability.\r\n\r\nThe differences between these approaches are especially apparent for models that require data-dependent control flow, such as conditionals or loops, which are important for state-of-the-art methods in Reinforcement Learning, sequence-based models, and many other emerging research areas. Imperative platforms allow a user to write idiomatic and native Python control flow, using traditional syntax for data-dependent control flow operations such as conditionals and loops. However, this approach reduces opportunities for whole-program optimization and requires retracing on every execution for automatic differentiation. Graph-based platforms avoid this issue, but do not allow traditional Python syntax for data-dependent control flow, and instead require any data-dependent control flow to be expressed in a functional form. This is required because Python does not natively support deferring the execution of control flow.\r\n\r\nWhile graph-based and imperative programming are often presented as orthogonal and independent programming paradigms, we provide an approach that offers the best of both, retaining imperative usability benefits while still yielding graph-based performance and portability benefits. 
We note that this approach assumes the ability to transform code into a specialized IR, and that this IR confers real benefits to the programmer such as speed, memory and numerical stability optimizations, as well as deployability to a variety of platforms. However, like many IRs, we also assume that it is cumbersome to program directly. Due to its widespread usage and robust IR, we focus much of our discussion on TensorFlow graphs, but show in our evaluation (Section 9.1) that this approach is completely independent of any back-end, and indeed, we can represent programs not easily expressible in TensorFlow\u2019s IR by selecting a different back-end for our code generation engine to target.\r\n\r\nThe contributions of this paper are as follows:\r\n  \u2022 We propose a new methodology that provides users the expressive power of imperative ML systems, while retaining the performance and portability of graph-based systems.\r\n  \u2022 We demonstrate this methodology in Python using static analyses and source code transformations (SCT).\r\n  \u2022 Using these analyses and code transforms, we enable staged programming in Python dispatching on runtime type information, in most cases requiring no additional annotations.\r\n  \u2022 We use our system, called AutoGraph, to convert idiomatic Python into TensorFlow Graph IR. We show that AutoGraph generalizes to target other back-ends, and can convert Python code into the Lantern IR, which supports features absent from the TensorFlow Graph IR, such as re-entrant function calls.\r\n  \u2022 We demonstrate that our system allows a user to easily express complex ML programs that lower to an optimized IR, and run as fast as hand-written alternatives.\r\n\r\n2   RELATED WORK\r\n\r\nA number of existing systems and approaches also aim to provide an easy-to-use programming interface for defining ML models without degrading performance. One such example is the Open Neural-Network eXchange (ONNX) format (ONNX Contributors, 2018), which provides an IR with APIs for many high-level front-ends that can target a number of popular back-ends focused on optimization and high-performance computing. This IR is exhibited as a computation graph, generated through the use of tracing, as in many imperative systems. ONNX provides insight into the ability to use an IR as the broker between imperative and graph-based systems, though extracting graphs via tracing may yield a loss of control flow information due to the inability to capture data-dependent control flow.\r\n\r\nAnother recent approach is that of PyTorch\u2019s Torch Script framework (PyTorch Contributors, 2018). While based on Python AST translation similar to AutoGraph, there are a number of important differences, most notably the lack of staging beyond shape propagation on a dynamically-shaped graph. A more complete comparison of Torch Script and AutoGraph can be found in Section 10. The Myia system (van Merrienboer et al., 2018) provides a similar facility to Torch Script, where the user expresses numeric code in Python which is then parsed into a graph-based IR distinct from the Python AST. JANUS (Jeong et al., 2019) operates like a JIT compiler from Python bytecode to TensorFlow graph code, modifying the Python interpreter. In contrast, AutoGraph works as a stand-alone library performing source-to-source transformations.\r\n\r\nProviding easier deferred execution using staged programming or multiple dispatch has a long history. Notable examples include Lightweight Modular Staging\u2019s type-based deferred execution model (Rompf & Odersky, 2010), the paired use of Lua and Terra to stage high-performance numerical code (DeVito et al., 2013), and Julia\u2019s multiple dispatch system (Bezanson et al., 2012). Libraries implementing or using code rewriting in Python have been in limited use, including the privacy- and confidentiality-aware Jeeves system (Yang et al., 2016), which relies on MacroPy (Haoyi et al.), as well as the Hy system, a Lisp dialect embedded in Python (Hy Contributers, 2018). However, each of these approaches alone, without substantial modification, is inappropriate for the Python language.\r\n\r\nOther efforts contributed a variety of ML frameworks with different features. Lantern (Wang & Rompf, 2018; Wang et al., 2018) applied concepts of programming languages research (delimited continuations and multi-stage programming) to implement an expressive graph-based ML framework. Tangent (van Merrienboer et al., 2017) performs automatic differentiation using SCT. Dynet (Neubig et al., 2017) is a define-by-run system with a dynamic batching runtime for automated batching computations. MXNet (Chen et al., 2015) offers both options of define-by-run and graph-based through the use of different syntax. Both chainer (Tokui et al., 2015a) and torch-autograd, a Lua port of the Autograd library (Torch Autograd Contributors, 2018), are pure define-by-run systems. Numba (Lam et al., 2015) translates annotated Python functions to machine code at runtime.\r\n\r\n3   PROGRAMMING TENSORFLOW\r\n\r\nThe TensorFlow software programming system has become popular for ML practitioners, particularly those focusing on large-scale training and deployment (Hale, 2018). ML programs naturally execute in separate stages, as model architecture and data examples become available at different points in a program\u2019s lifecycle, and TensorFlow makes these stages explicit. A TensorFlow user must first build up a representation of the computation to be run, and then later in their program, specify that the computation should be executed. Dataflow graphs are used for this representation, because they can be readily optimized, distributed and deployed. This programming model is sometimes non-obvious, leading to usability issues and bugs, and the problem is particularly acute when specifying control flow. For example, some control flow constructs should be included in the lowered IR, while others are meant to specify whether or not computation should be staged into the IR. A common coding pattern is to conditionally stage computation using model hyperparameters:\r\n\r\n    # Conditional on bool not added to graph\r\n    if HParams.nonlin == 'relu':\r\n        x = tf.nn.relu(x)\r\n    else:\r\n        x = tf.nn.tanh(x)\r\n\r\nHowever, other uses of control flow are meant to be executed in a data-dependent manner:\r\n\r\n    # Conditional on Tensor added to graph\r\n    x = tf.cond(tf.reduce_sum(x) > 0,\r\n                lambda: x * x, lambda: x)\r\n\r\nIn the code above, the conditional statement is expressed in a functional style so that it can be executed in-graph in a data-dependent manner. However, this clashes aesthetically and pragmatically with the imperative style of Python. This difficulty is exacerbated when the user needs to nest control flow, or use other Python idioms like continue and break. We would instead prefer to write\r\n\r\n    # Conditional on Tensor - staged\r\n    if tf.reduce_sum(x) > 0:\r\n        x = x * x\r\n\r\nand have it be automatically converted into the functional style. We want this conversion to only occur for expressions using numeric types. Conditionals switching on plain Python booleans (e.g., the hyperparameter example above) should be executed imperatively, without staging.\r\n\r\n4   EXTENDING OPERATOR OVERLOADING\r\n\r\nIn the case of TensorFlow, metaprogramming dataflow graphs can be difficult for complex programs, but it is made easier via operator overloading. For example, the user does not need to type out tf.add(a, b), but instead can simply use a + b. This is possible due to Python\u2019s ability to allow the programmer to overload a subset of the language. Python\u2019s approach to operator overloading allows custom classes, like the Tensor type in TensorFlow, to override some default functionality, like their behavior when used in binary operators (e.g. +, *, -, %, /, ^, ~) or item access.[1]\r\n\r\n[1] See Python Language Reference (https://docs.python.org/3/reference/), Section 3.3.\r\n\r\n    # Because Python lets us write this ...\r\n    class Tensor(_TensorLike):\r\n        def __add__(self, right):\r\n            return tf.add(self, right)\r\n\r\n    # ... we can write this\r\n    import tensorflow as tf\r\n    a = tf.constant(3)\r\n    b = tf.constant(4)\r\n    c = a + b\r\n\r\nThis is a powerful facility in the Python language, but it unfortunately only extends to methods of objects or classes, and does not include programming constructs required to build modern ML models. For example, the behavior of conditionals cannot be overloaded in Python.\r\n\r\n    # We can write if statements...\r\n    if cond:\r\n        ans = true_fn()\r\n    else:\r\n        ans = false_fn()\r\n\r\n    # ... but we cannot overload them\r\n    def __if__(self, cond, true_fn, false_fn):\r\n        if cond:\r\n            return true_fn()\r\n        else:\r\n            return false_fn()\r\n\r\nIf overloading control flow syntax were possible, imperative programs would be able to generate full representations of user code, including previously-invisible loop and conditional statements. Graph-based programs would not need to require users to write their program control flow in a cumbersome functional form, because they could provide non-standard overrides of __if__, __for__ and __while__ and other useful parts of the Python language.\r\n\r\nTo circumvent this limitation, we use SCT on whole functions to enable overloading non-local parts of the Python language. We describe a specific instantiation of this system, called AutoGraph, which uses SCT to allow users to target a lower-level IR while still writing idiomatic Python.\r\n\r\n5   STAGED PROGRAMMING FOR REAL-WORLD ML SYSTEMS\r\n\r\nUsing the ability to overload arbitrary Python syntax, we built a staged programming system called AutoGraph for improving the performance of imperative-style ML programs and, conversely, the simplicity of graph-based ML programs. AutoGraph allows users to program using idiomatic and imperative-style Python, but still benefit from the advantages of TensorFlow graphs, and is exposed to users via a single-function API, as a Python function decorator as seen in Listing 1.\r\n\r\n    import autograph as ag\r\n\r\n    # AutoGraph converts whole\r\n    # functions via a decorator...\r\n    @ag.convert()\r\n    def f(x):\r\n        if x > 0:\r\n            x = x * x\r\n        return x\r\n\r\n    # ... into a form where control flow\r\n    # and other idioms are overloadable\r\n    def new_f(x):\r\n        def if_true():\r\n            x_1 = x\r\n            x_1 = x_1 * x_1\r\n            return x_1\r\n        def if_false():\r\n            return x\r\n        x = ag.if_stmt(\r\n            ag.gt_(x, 0), if_true, if_false)\r\n        return x\r\n\r\nListing 1: AutoGraph automatically converts the code on the top into the code on the bottom (simplified example).\r\n\r\nAutoGraph works with control flow, such as if, for and while statements, even if they are arbitrarily nested or contain break and continue statements.\r\n\r\nThe AutoGraph system can overload conditionals and loops via SCT, allowing us to deviate from Python\u2019s default behavior. Note that, using the same style of SCT, we may choose to overload some statements while preserving Python semantics for others. Because of this, we anticipate that this might be a tool of general interest to Python developers, or a feature that new language implementations might want to consider including. In order to transparently support control flow that is meant to either be staged or unstaged in TensorFlow, as in the conditional examples in Section 3, we must change the behavior of if statements based on the type of the boolean predicate.\r\n\r\n6   \u201cDYNAMIC DISPATCH\u201d ENABLES STAGED PROGRAMMING IN PYTHON\r\n\r\nGiven that we can enable overloadable control flow in Python, we can redefine its default behavior by writing a non-default implementation of ag.if_stmt. In the case that a Python boolean is used as the predicate of a conditional, we would want to execute the conditional with normal semantics. However, if a TensorFlow Tensor is supplied, or some other specialized numeric type, we would want to stage more specialized code. A simplified version of ag.if_stmt is shown in Listing 2.\r\n\r\n    def if_stmt(cond, body, orelse):\r\n        if is_tensor(cond):\r\n            return tf.cond(cond, body, orelse)\r\n        elif cond:\r\n            return body()\r\n        else:\r\n            return orelse()\r\n\r\nListing 2: Simplified version of AutoGraph\u2019s conditional statement override.\r\n\r\nWe use the term dynamic dispatch to describe this runtime decision making, as it is analogous to dynamic method dispatch common in object oriented programming. Dynamic dispatch critically allows us to seamlessly switch between two common uses of control flow in ML code: a \u201cmacro-programming\u201d mode that switches or loops on the value of hyperparameters, and a data-dependent mode, where the control flow is lowered into the target IR.\r\n\r\nThe same logic is applied to for and while loops in the equivalent of ag.for_stmt and ag.while_stmt functions. We also provide functionality for overriding the print statement, which is ordinarily incompatible with TensorFlow graphs, since print would log information immediately, and we instead want to log values at graph runtime.\r\n\r\nNote that some native Python constructs, like break and continue statements, have no direct representation in TensorFlow. This requires code transformations which entirely remove these statements without affecting program semantics. This is achieved by lowering the respective statements into equivalent TensorFlow constructs. For example, continue is lowered using extra variables and conditionals.\r\n\r\nThe dynamic dispatch approach incurs extra runtime overhead. Indeed, if AutoGraph were used to perform normal unstaged Python computation, it would be slower. However, because we target a lower-level IR that can be executed separately from the Python runtime, this overhead is amortized.\r\n\r\nGeneral Approach   The conversion of a function proceeds as follows:\r\n  1. Read the source code of the function and obtain its closure variables, if they are available.\r\n  2. Parse the source code into a Python AST, abstracting away small differences between Python versions.\r\n  3. Transform the source code in multiple passes, with each pass consisting of two major steps:\r\n      (a) Static analysis, detailed below. The AST is annotated with additional information that the actual transformation may use.\r\n      (b) AST transformations, where each transform handles a specific Python idiom. The specific transformations are detailed below.\r\n  4. Serialize the final AST into output code.\r\n  5. Load the new output code in as a Python function, and dynamically attach symbols corresponding to the original function\u2019s closure variables.\r\n\r\nComparison with Static Methods   It is possible to extract computation graphs from Python code statically, but doing so requires a strict set of constraints. Systems like Torch Script (PyTorch Contributors, 2018) elect to impose these constraints in the form of a DSL which is a limited subset of Python. A major design decision in AutoGraph, however, is to allow users access to as much of the original Python interface as is possible (we discuss limitations to this in Section 10). Furthermore, due to binding-time analysis, relying wholly on static methods disallows staged programming in Python without requiring some form of an ersatz static type system (e.g., static type annotations). While enabling staged programming in a dynamic setting for arbitrary types does require careful consideration (Decker et al., 2019), our decision to primarily target TensorFlow as a back-end significantly alleviates some of the implementation pains due to the central focus of an array-based type (tensor). We discuss this in detail in Section 7.\r\n\r\n7   CODE ANALYSES AND CONVERSION\r\n\r\nOnly a subset of Python can be trivially converted, and substantial rewriting of user-provided code is necessary to enable the overloading required for staged programming. For example, loops and conditionals need to be rewritten in functional form; nonlocal control flow statements need to be lowered. We perform these rewrites with the aid of dataflow and other analyses of program structure. We also separate these rewrites into multiple specialized passes.\r\n\r\n7.1   Dataflow Analysis\r\n\r\nEach specialized pass is preceded by several dataflow analysis passes. These are described below, in the order that they are run.\r\n\r\nControl Flow Graph Construction   A standard intra-procedural control flow graph (CFG) supports several static analyses.\r\n\r\nQualified Name Resolution   We create the abstraction of qualified names to extend the notion of symbols to include compound names such as a.b. For example, the qualified name a.b roughly corresponds to the AST: Attribute(name=Name('a'), attr='b').\r\n\r\nActivity Analysis   Here we annotate AST nodes with the list of symbols read and modified by the respective statement. Only direct modifications are considered writes. For example, in the statement a.b = c, a.b is considered to be modified, but a is not. The activity analysis also keeps track of lexical scopes, their nesting relationships (e.g. the parent scope) and the symbols they include.\r\n\r\nReaching Definitions Analysis   This standard dataflow analysis helps identify the definition that reaches each name. Additionally, the list of symbols defined on entry of certain statements is also annotated.\r\n\r\nLiveness Analysis   This standard dataflow analysis identifies symbols that are live upon entry into or exit from certain statements, including compound statements like conditionals.\r\n\r\n7.2   Code Conversion Passes\r\n\r\nAutoGraph performs code conversion using an extensible system of multiple, typically independent, AST conversion passes. For example, one conversion pass rewrites the if statements into an overloadable functional form. Another pass lowers the break statements into new loop predicates and extra conditionals. This mechanism facilitates adding support for more Python idioms over time.\r\n\r\nCurrently, the transformations include the following, in order of application:\r\n\r\nDirectives   Identifies calls to specific functions that serve as AutoGraph compilation directives and annotates the relevant AST nodes. An example of such a directive is ag.set_loop_options.\r\n\r\nBreak, Continue and Return Statements   These are actually three separate passes, but are very similar in nature. In each case, the corresponding statement is lowered into conditionals or expanded loop conditions.\r\n\r\n    # Before conversion\r\n    if cond:\r\n        return f(x)\r\n    return g(x)\r\n\r\n    # After conversion\r\n    if cond:\r\n        return_value = f(x)\r\n    else:\r\n        return_value = g(x)\r\n    return return_value\r\n\r\nAssert Statements   These are converted in-place to overloadable functional form.\r\n\r\nLists   List idioms, including list literals and the append and pop function calls, are overloaded with custom functions (e.g. ag.list_append and ag.list_pop) that allow staging the respective operation.\r\n\r\nArray computations require an additional idiom not present in the standard Python library: the stack operation. AutoGraph provides the ag.stack function which can be overloaded in a manner consistent with the other overloads.\r\n\r\nNote that list access (e.g. l[i]) and mutation are deferred to a separate conversion pass which covers slice operators.\r\n\r\nof the conditional operators always sets the symbols that the conditional may modify in either branch. To simulate the undefined semantics, we use a special value to reify the \u201cundefined\u201d state of a variable. This currently deviates from Python semantics, but we plan to remedy this by verifying and explicitly deleting \u201cundefined\u201d symbols before they are used.\r\n
The while and for loops are stateful, and their func-\r\n                                                                                    tional form requires functions whose arguments and return\r\n                Slices   Python does allow overloading the slice opera-             values represent the variables modi\ufb01ed inside the loop (its\r\n                tors ( __setitem__ , __getitem__ ) in user classes.                 state).\r\n                However, the slice write operation has the semantic\r\n                of mutating the target.        We rewrite slice writes to            # Before conversion\r\n                use value semantics as currently required by Tensor-                 while x > eps:\r\n                Flow. For instance, x[i] = y is converted in-place to                   x = f(x)\r\n                x = ag.setitem(x, i, y) . Slice read operations are                  # After conversion (simplified)\r\n                converted mechanically.                                              def loop_test(x):\r\n                                                                                        return x > eps\r\n                Function Calls     All function calls are overloaded. The            def loop_body(x):\r\n                                                                                        return f(x)\r\n                overload will either dynamically convert the target function,        x = ag.while_stmt(\r\n                call it as-is or replace it with a new function, depending              loop_test, loop_body, (x,))\r\n                on the characteristics of the function being called and the\r\n                con\ufb01guration of the conversion. For example, the built-             The for statement is handled similarly. 
Similar to if\r\n                in function print may be converted to tf.print (see                 statements, while and for loops may de\ufb01ne symbols\r\n               Appendix E for details).                                             inside their body. If the loop body never executes, those\r\n                                                                                    symbols will remain unde\ufb01ned. This is also handled by\r\n                # Before conversion                                                 using special \u201cunde\ufb01ned\u201d values for the symbols that are\r\n                def f(a, x):                                                        not de\ufb01ned (as identi\ufb01ed by liveness analysis) upon entry\r\n                   return a(x)                                                      into the loop.\r\n                # After conversion (simplified)                                     The overloaded control \ufb02ow uses dynamic dispatch (see\r\n                def f(a, x):                                                        Appendix E).\r\n                   return ag.converted_call(a, x)\r\n                Control Flow      This conversion pass replaces all local con-      Ternary Conditional Expressions          Theternary operator\r\n                trol \ufb02ow with an overloadable equivalent functional form.            x if cond else y isconverted inline to the functional\r\n               The if statement is stateless, therefore its functional form         form ag.if_stmt(cond, x, y) .\r\n                can be expressed using niladic functions that return all the        Logical Expressions       Binary and unary logical expres-\r\n               variables modi\ufb01ed inside the statement.                              sions can be handled using traditional operator overloading\r\n                # Before conversion                                                 (e.g.  __lt__ for the < operator). 
However, Tensor\r\n                if x > 0:                                                           does not support all operators for compatibility reasons (for\r\n                   x = x * x                                                        example, __eq__ is not supported). Therefore we replace\r\n                # After conversion (simplified)                                     certain binary and unary operators inline with overloadable\r\n                def true_fn():                                                      functional forms. For example, a and b is replaced with\r\n                   return x * x                                                      ag.and_(a, b) .\r\n                def false_fn():\r\n                   return x\r\n                x = ag.if_stmt(x > 0, true_fn, false_fn)                            Function Wrappers        This conversion pass wraps the en-\r\n                                                                                    tire block of functions with additional boilerplate code. This\r\n                Note that Python allows to de\ufb01ne (i.e., assign for the \ufb01rst         accommodates for examples the necessary calls to create a\r\n                time) symbols inside the body of control \ufb02ow statements             TensorFlow name scope, which improves the readability of\r\n                and use them later. It is possible to write code where sym-         the rendered graph. In addition, the function wrappers con-\r\n                bols may be unde\ufb01ned based on whether the branch of a               tain specialized error handlers that intercept certain errors\r\n                conditional executed or not. 
However, the functional version 6 to improve usability.\r\n                                           AutoGraph: Imperative-style Coding with Graph-based Performance\r\n               8    BEYONDTENSORFLOW: ALTERNATE                                 With the modi\ufb01cations in place which allow us to target\r\n                    BACK-ENDS                                                   Lantern, this will generate the following Python code (sim-\r\n                                                                                pli\ufb01ed for presentation):\r\n               If TensorFlow is the only back-end of this code transfor-\r\n               mation, then the limitations of TensorFlow must also ap-          def run(base, tree):\r\n               ply to AutoGraph. However, due to the nature of meta-                   def tree_prod(base, tree):\r\n               programming, the SCT in AutoGraph can easily be used                         def true_fn():\r\n               to target a variety of back-ends. As previously discussed,                         return base\r\n               one shortcoming of TensorFlow is the inability to handle                     def false_fn():\r\n               re-entrant in-graph functions, and by extension, recursive                         l = __call_staged(tree_prod,\r\n               models. 
In order to showcase the utility of a general purpose                                 base, tree.left)\r\n               SCTmethodologyasimplementedbyAutoGraph,weelect                                     r = __call_staged(tree_prod,\r\n               to target a new ML framework prototype called Lantern                                         base, tree.right)\r\n                                                                                                  return l * r * tree.value\r\n               (Wang&Rompf,2018;Wangetal.,2018),whichiscapable                              ag.if_stmt(tree.is_empty,\r\n               of generating graphs describing recursive models.                                        true_fn, false_fn)\r\n                                                                                       __def_staged(tree_prod, base, tree)\r\n               TheLanternIR TheLanternback-endconvertsLisp-like                        return __call_staged(tree_prod, base,\r\n                                                                                                                    tree)\r\n               S-expressions describing numeric operations into ef\ufb01cient\r\n               C++code. Critically, Lantern supports programming fea-\r\n               tures absent in the TensorFlow graph speci\ufb01cation, like func-    Note that in order to correctly generate the staged function,\r\n               tion recursion and in-line function de\ufb01nitions, which are         __def_staged mustbepassedtheargumentswhichwill\r\n               essential in some state-of-the-art ML language models. We        eventually be passed to the function being de\ufb01ned. 
Running\r\n               demonstrate the generality of AutoGraph by targeting the         this generates S-Expression code, which is then fed as input\r\n               Lantern S-expression IR, which is supported by additional        to Lantern, which performs some internal computations and\r\n               code conversion passes.                                          eventually generates and executes the following C++ code:\r\n               Staging Functions and Recursion           In order to deal        double Snippet(double base, Tree tree) {\r\n               with functions in our model, we introduce two new                    auto rec = [&](Tree tree,\r\n                                                                                    function<double(double)> cont,\r\n               functions:   __def_staging(function, *args) and                      double base) {\r\n                __call_staging(function, args) . These emit a                          double grad = 0.0;\r\n                                                   *                                   if (!tree.is_empty) {\r\n               function de\ufb01nition or call, respectively, in the generated                 auto cont_l = [&](double x1) {\r\n               S-Expression. Due to the deferred API presented by Au-                       double sub_grad = 0.0;\r\n               toGraph, we have the ability to specialize the generated                     auto cont_r = [&](double x2) {\r\n               functions in the S-Expression IR with respect to known pa-                      double x3 = tree.value;\r\n               rameters. 
Note that this specialization in function calls/de\ufb01-                  double x4 = cont(x1 * x2 * x3);\r\n               nitions requires no additional modi\ufb01cations, as it is handled                   double x5 = x3 * x4;\r\n               using the existing dispatching and overloading mechanisms                       sub_grad += x2 * x5;\r\n                                                                                               return x1 * x5;\r\n               present in AutoGraph. With the ability to de\ufb01ne and call                     };\r\n               functions in the generated computation graph, this provides                  grad += rec(tree.R, cont_r, base);\r\n               the interface necessary for de\ufb01ning and running recursive                    return sub_grad;\r\n               models.                                                                    };\r\n                                                                                          grad += rec(tree.L, cont_l, base);\r\n               Todemonstrate this, we provide an end-to-end example of                 } else\r\n               Python \u2192 S-Expr \u2192 C++. 
We \ufb01rst examine a recursive                         grad += cont(base);\r\n               function in Python, as follows:                                         return grad;\r\n                                                                                    };\r\n                                                                                    return rec(tree,\r\n                @ag.convert()                                                          [&](auto x){return 1.0;}, base);\r\n                def tree_prod(base, tree):                                       }\r\n                     if not tree.is_empty:\r\n                           l = tree_prod(base, tree.left)                       Asshown,staging a recursive function requires that the gen-\r\n                           r = tree_prod(base, tree.right)                      erated C++ code also be recursive (as noted by the rec\r\n                           return l * r * tree.value                            function). We note that the generated C++ code looks fairly\r\n                     else:                                                      complicated, due to the handling of back-propagation. Back-\r\n                           return base                                          propagation is implemented via callbacks (seen as contin-\r\n                                                                             7 uations, noted by cont , cont_l , and cont_r in the\r\n                                                                       AutoGraph: Imperative-style Coding with Graph-based Performance\r\n                        code), the details of which can be referenced in Wang &                                                                Table 1. RNN Cell Performance (1K examples/sec)\r\n                        Rompf(2018); Wangetal. (2018).                                                                                
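The staging of recursive functions described in Section 8 can be illustrated with a minimal, self-contained sketch in plain Python. This is not AutoGraph's or Lantern's implementation: the Staged class, the helpers lift, if_stmt, call_staged and def_staged, and the S-expression syntax itself are all assumptions made for illustration. In particular, def_staged here takes parameter names, whereas the paper's __def_staged is passed the actual arguments. The point of the sketch is that the function body is traced exactly once, and each recursive call is emitted as a call node that refers to the function by name instead of being unrolled.

```python
class Staged:
    # Hypothetical symbolic value: records an S-expression string
    # instead of computing a number.
    def __init__(self, sexpr):
        self.sexpr = sexpr

    def __mul__(self, other):
        return Staged('(* %s %s)' % (self.sexpr, lift(other).sexpr))

    def field(self, name):
        # Staged attribute access, e.g. tree.field('left') -> (left tree)
        return Staged('(%s %s)' % (name, self.sexpr))


def lift(value):
    # Wrap plain Python constants so they can appear in an S-expression.
    return value if isinstance(value, Staged) else Staged(repr(value))


def if_stmt(cond, true_fn, false_fn):
    # Staged conditional: both branches are traced into the output.
    return Staged('(if %s %s %s)' % (
        cond.sexpr, lift(true_fn()).sexpr, lift(false_fn()).sexpr))


def call_staged(fn, *args):
    # Emit a call node; the callee is referenced by name and NOT traced
    # again, which is what keeps recursion finite during staging.
    rendered = ' '.join(lift(a).sexpr for a in args)
    return Staged('(call %s %s)' % (fn.__name__, rendered))


def def_staged(fn, *param_names):
    # Emit a definition by running fn once on symbolic parameters.
    params = [Staged(name) for name in param_names]
    body = lift(fn(*params))
    return '(def %s (%s) %s)' % (
        fn.__name__, ' '.join(param_names), body.sexpr)


def tree_prod(base, tree):
    # Hand-written analogue of the converted function shown in the paper.
    def true_fn():
        return base

    def false_fn():
        left = call_staged(tree_prod, base, tree.field('left'))
        right = call_staged(tree_prod, base, tree.field('right'))
        return left * right * tree.field('value')

    return if_stmt(tree.field('is_empty'), true_fn, false_fn)


program = def_staged(tree_prod, 'base', 'tree')
print(program)
```

Running the sketch prints a single definition in which tree_prod refers to itself by name in two call nodes. A back-end in the role of Lantern would consume such a definition and generate recursive target code from it.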
(throughput in thousands of examples per second, mean ± standard deviation)

                 SeqSize: 64                                 SeqSize: 128
Batch Size       32           64           128               32           64           128
Eager            0.82 ± 0.08  1.57 ± 0.13  2.04 ± 0.14       0.43 ± 0.03  0.76 ± 0.05  1.04 ± 0.06
Official         2.88 ± 0.11  3.63 ± 0.13  5.13 ± 0.15       1.44 ± 0.04  1.91 ± 0.06  2.61 ± 0.05
Handwritten      2.95 ± 0.13  3.71 ± 0.15  5.24 ± 0.11       1.52 ± 0.06  1.96 ± 0.07  2.68 ± 0.03
AutoGraph        2.72 ± 0.09  3.61 ± 0.12  5.05 ± 0.10       1.37 ± 0.04  1.86 ± 0.06  2.59 ± 0.04

9   EVALUATION

We tested the utility of AutoGraph on several axes. First, we asked whether AutoGraph could improve the readability of ML code that relied on data-dependent control flow without incurring a performance penalty. Second, we tested if AutoGraph could be used to move computation usually left outside of the TensorFlow graph, such as the entire training process of stochastic gradient descent (SGD), inside the graph IR. Third, we tested if AutoGraph could be used to produce performant code using features not supported in the TensorFlow graph by targeting alternative IRs. We also prepared additional samples of more complex algorithms, including Neural Model Translation with Attention, Sequence-to-sequence, MAML metalearning and L-BFGS optimizations. These can be found in Appendix D.

RNN cells.  The code snippet below is an implementation of an RNN model that on simple inputs produces results identical to TensorFlow’s built-in tf.dynamic_rnn function and runs at similar speed.

    def dynamic_rnn(rnn_cell, input_data,
                    initial_state, sequence_len=None):
        input_data = tf.transpose(input_data,
                                  (1, 0, 2))
        outputs = []
        ag.set_element_type(outputs, tf.float32)
        state = initial_state
        if sequence_len is None:
            max_len = tf.shape(input_data)[0]
        else:
            max_len = tf.reduce_max(sequence_len)
        for i in tf.range(max_len):
            prev_state = state
            output, state = rnn_cell(input_data[i],
                                     state)
            state = tf.where(
                i < sequence_len,
                state,
                prev_state)
            outputs.append(output)
        outputs = ag.stack(outputs)
        outputs = tf.transpose(outputs,
                               (1, 0, 2))
        return outputs, state

Compare this terse and readable implementation to the equivalent graph version in Appendix A.

We compared TensorFlow’s official implementation of tf.dynamic_rnn with both a hand-written, graph-based implementation, and the code snippet above converted into graphs via AutoGraph. Each run consisted of an execution of an RNN having hidden size 256, while varying batch sizes and the sequence length. Five warm-up runs were executed, and the mean and standard deviation of the 100 following runs are reported. For all examples, each run is executed as one tf.Session.run() call. All benchmarks were run on a dual-threaded 6-core Intel Xeon E5-1650 CPU. The use of AutoGraph improves the readability of the code and has a very minor effect on performance.

In-Graph Training.  Typically, a TensorFlow graph representing a single training step is executed repeatedly in a Python training loop outside of TensorFlow. This method is used because of the difficulty of using control flow operators within TensorFlow graphs, and incurs additional computational overhead. Here, we use AutoGraph to demonstrate a training loop that is implemented entirely as a computation graph. We trained a single linear layer on MNIST with stochastic gradient descent (SGD), and compared its performance with several other implementations. The first approach was TensorFlow Eager, an imperative execution mode for TensorFlow similar to NumPy and PyTorch. The second approach we tested was a traditional TensorFlow training process. The third approach was an in-graph training loop implemented using the TensorFlow while_loop API.

Table 2. Model and Training Loop

Implementation                        SGD Steps/sec
Eager                                 274.1 ± 3.6
Model In Graph, Loop In Python        484.1 ± 7.7
Model And Loop In Graph               646.5 ± 14.1
Model And Loop In AutoGraph           623.5 ± 13.5

Each run consisted of 1000 training steps with a batch size of 200. One warm-up run was executed, and the mean and standard deviation of the 10 following runs are reported. For the in-graph training loop examples, the entire set of 1000 training steps is executed in one tf.Session.run() call. For the other examples, each training step is run as a separate tf.Session.run() call. Executing a single-training-step graph repeatedly in a Python loop (the traditional approach) is faster than the eager-style code by 75%. Moving the entire training process into a TensorFlow graph further yielded a roughly 30% speedup.

9.1   AutoGraph + Lantern: TreeLSTM

We evaluated a model of TreeLSTM for Sentiment Classification running on the Stanford Sentiment Treebank dataset (Socher et al., 2013), following the work of Tai et al. (2015). The model embeds sentence parse-trees by recursively embedding the left/right sub-trees and combining the embedding vectors via a BiLSTM core. The embedding of whole sentences is then passed to an MLP for sentiment prediction. The model can be easily expressed in PyTorch using recursive functions, or in AutoGraph targeting recursive functions in Python. The final generated C++ code was compared against the PyTorch implementation in terms of training efficiency. To approximate a “real-world” running time, this experiment was run using a single thread on a laptop with a dual-core AMD A9-9410 Radeon CPU @ 1.70GHz and 8GB of SODIMM Synchronous 2400 MHz RAM, with Ubuntu 16.04.

Our AutoGraph implementation of TreeLSTM targeting Lantern yielded performance approximately 2.38 times faster than that of the PyTorch implementation. Our system achieved approximately 36.75 SGD steps per second, compared with the 15.41 steps per second using the PyTorch implementation. We note that we used a batch size of 1 for both systems due to difficulty in batching recursive models.

Table 3. TreeLSTM Targeting Lantern

Implementation                              SGD Steps/sec
Loop and Model in PyTorch                   15.41
Loop and Model in AutoGraph/Lantern         36.75

10   DISCUSSION

Developing a source code transformation methodology is far from mechanical. There exist a number of design decisions which may ultimately yield different results in terms of expressiveness, performance and portability. In this section, we discuss some of these decisions and provide insight regarding how they shaped the current state of AutoGraph, including its current limitations. We provide a detailed discussion of error handling in AutoGraph in Appendix B.

Engineering Practices as a Feature.  The code conversion passes we implement in AutoGraph are non-local, and can interact with each other in complicated ways. For instance, converting deeply-nested for loops and if statements exposes dataflow interactions between each level of nesting. In order to build a reliable system, we made extensive use of engineering best-practices. For instance, all static analyses, code transforms, and utility functions are extensively unit tested (>50% of the 22k LOC in AutoGraph is tests). Further, interactions between features are tested in end-to-end reference tests. Any changes to the AutoGraph system require that all unit and reference tests pass, and all code is manually reviewed by at least one engineer for correctness, readability and adherence to style guidelines. Anecdotally, this test- and review-oriented development practice has caught many surprising and subtle bugs, and allowed a library as complex as AutoGraph to remain relatively easy to maintain and extend. Further, we built many useful utilities for manipulating Python source code that simplified development (described in Appendix C).

Alternative Approaches for Implementing Staged Programming.  An alternative approach to SCT would have been to build a new Python interpreter with non-standard execution semantics for Python programs that could map to TensorFlow graphs, and indeed, an early proposal for AutoGraph was to do exactly this. However, a non-standard Python interpreter would require reimplementing all aspects of the Python language, including those parts that require no modifications in machine learning code.

We could also parse Python to our own intermediate representation, a strategy taken recently by the Myia system (van Merrienboer et al., 2018). This intermediate representation could then be either back-converted to Python or executed in a dedicated VM. Indeed, this strategy is similar to our ability to work with Lantern; AutoGraph modifies the original Python source code such that it generates S-Expressions as an IR, which are then consumed by Lantern.

Our choice to emit Python code after conversion has several advantages. Unsupported code idioms are allowed to pass through conversion if they do not affect the program semantics. This simplifies the support for legacy TensorFlow code. Further, the generated code can be inspected, and even modified by the user.

Comparing Torch Script and AutoGraph.  Similar to ONNX, PyTorch’s Torch Script framework (PyTorch Contributors, 2018) allows users to save models for later evaluation, while providing an even higher-level interface for programming: nearly native Python with two new decorators. These decorators, torch.jit.trace and torch.jit.script, produce Torch Script code (a subset of Python used as an IR for the eventual computation graph) from idiomatic Python, though they accomplish this via different methods.

The torch.jit.trace decorator works as the name suggests: it extracts computation graphs through tracing. This produces fully shape-specialized Torch Script code, which allows for highly optimized models (and an easy target for potential compilers). However, tracing in Torch Script has the same drawback as found in ONNX: as stated clearly by the Torch Script developers, “Tracing only correctly records functions and modules which are not data dependent (e.g., have conditionals on data in tensors)...”

Torch Script’s torch.jit.script decorator, on the other hand, will directly translate the decorated Python function to Torch Script code, which does allow for data-dependent control flow. While this seems similar to AutoGraph’s source code transformation model (detailed in Section 6), there are a number of important differences between the two methodologies. Torch Script is inherently bound to the PyTorch runtime, which prevents the use of Torch Script with any other specialized or accelerated ML back-ends.

generic, element access lacks type information and we may require additional user annotations when the IR is strongly typed, which is usually the case. More advanced type inference mechanisms that could obviate these annotations are a subject for future work.

We make a best effort to guarantee that the conversion to IR is either semantics-preserving, or it explicitly fails. However, a more rigorous treatment of the correctness of our system is needed. We plan to treat this both formally and empirically, using a random code generation fuzzing system. In the meantime, we provide as evidence of correctness an
Furthermore, torch.jit.script                   expansive test suite for AutoGraph, containing hundreds\r\n               does all of its work at compile time, and thus the only            of tests. Furthermore, due to AutoGraph being included\r\n               view of staging available currently is the ability to do shape     in tf.function , the default way to accelerate code in\r\n               propagation on a dynamically-shaped graph (resulting from          TensorFlow 2.0, AutoGraph is also subject to all tests cov-\r\n                torch.jit.script ). This drawback comes as a result               ering the TensorFlow codebase. While this notion of test-\r\n               of the decision to target a relatively basic IR (Torch Script),    based correctness does not provide a formal guarantee of\r\n               rather than Python code. One powerful consequence of this          correctness, we note that this is consistent with other formal\r\n               decision, however, is the ability to cleanly implement au-         analyses of Python semantics (Politz et al., 2013).\r\n               tobatching on Torch Script, which is otherwise dif\ufb01cult in         Lastly, AutoGraph relies on Python introspection and re\ufb02ec-\r\n               systems targeting a broader IR.                                    tion APIs, such as inspect and imp . While these are\r\n                                                                                  available in the vast majority if use cases, there are instances\r\n               Limitations     The Python language is large, and Auto-            whenAutoGraphcannotbeused,forexamplewhensource\r\n               Graph does not stage all of it. 
We focus on the subset             code information is not available.\r\n               that enables machine learning programming, but we are still\r\n               missing many useful constructs, such as associative data           11     CONCLUSIONS AND FUTURE WORK\r\n               structures and try/except blocks. In some cases, there             WehavedescribedAutoGraph,astagedprogrammingsys-\r\n               is no corresponding construct in the TensorFlow or Lantern         tem for automatically rewriting idiomatic Python code\r\n               IR,butaswebuildsupportformoreIRs,weanticipatebeing                 into an equivalent lower-level IR, including TensorFlow\r\n               able to successfully convert more of the Python language.          graphs and other, more experimental, back-ends. Auto-\r\n               Although only a subset of the Python language is converted         Graph achieves a balance in the design space between im-\r\n               to TensorFlow constructs, AutoGraph does allow nearly ev-          perative and graph-based code. These two programming\r\n               ery Python construct, and will simply call it unconverted.         models \u2013 fully-imperative with high runtime overhead, and\r\n               This allows AutoGraph to be compatible with the vast ma-           fully-staged with high developer mental overhead \u2013 are not\r\n               jority of existing graph code. Appendix E exhaustively             binary choices. Using SCT, we can eliminate the distinction\r\n               documents Python language support in AutoGraph.                    betweenthetwo. 
Webelievethatthisapproachisapplicable\r\n               In addition, the data-dependent staging decisions made by          broadly, and are working to target a wider suite of IRs in\r\n               AutoGraph are obscured from the user, much like Python             newapplications.\r\n               operator overloading obscures computation made in the              The entirety of AutoGraph is open sourced via the\r\n               overloaded operators. For instance, if the user accidentally       TensorFlow project on GitHub at https://github.\r\n               passes a Python boolean instead of a TensorFlow boolean            com/tensorflow/tensorflow/tree/master/\r\n               to a conditional, it will not be staged into a graph, with         tensorflow/python/autograph.\r\n               potential performance implications. Currently, the user has\r\n               few tools to catch and debug this behavior. We already\r\n               provide better error messages than a system like this naively      ACKNOWLEDGEMENTS\r\n               would (see Appendix B), but further work is required.              We would like to thank Alex Passos and the rest of the\r\n               Additional challenges arise from the mismatch between              TensorFlow team for their help and support integrating Au-\r\n               PythonandtheIRstypingsystem. Forexample,TensorFlow                 toGraph into TensorFlow 2.0.\r\n               does not support nullable types, so we impose additional           The technique based on dynamic dispatch was studied in\r\n               constraints on the Python semantics by requiring that all          prior work by Josh Levenberg.\r\n               code paths initialize a variable when control \ufb02ow is staged\r\n               in TensorFlow. 
Similarly, because Python types like lists are 10\r\n                                          AutoGraph: Imperative-style Coding with Graph-based Performance\r\n               REFERENCES                                                       Implementation (NSDI 19), pp. 453\u2013468, Boston, MA,\r\n              Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean,       2019. USENIXAssociation. ISBN 978-1-931971-49-2.\r\n                 J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al.     URLhttps://www.usenix.org/conference/\r\n                 Tensor\ufb02ow: A system for large-scale machine learning.          nsdi19/presentation/jeong.\r\n                 In OSDI, volume 16, pp. 265\u2013283, 2016.                       Lam, S. K., Pitrou, A., and Seibert, S. Numba: A llvm-\r\n              Al-Rfou, R., Alain, G., Almahairi, A., Angermueller, C.,          based python jit compiler. In Proceedings of the Second\r\n                 Bahdanau, D., Ballas, N., Bastien, F., Bayer, J., Belikov,     Workshop on the LLVM Compiler Infrastructure in HPC,\r\n                 A., Belopolsky, A., et al. Theano: A python framework          LLVM \u201915, pp. 7:1\u20137:6, New York, NY, USA, 2015.\r\n                 for fast computation of mathematical expressions. arXiv        ACM. ISBN978-1-4503-4005-2. doi: 10.1145/2833157.\r\n                 preprint arXiv:1605.02688, 472:473, 2016.                      2833162. URLhttp://doi.acm.org/10.1145/\r\n                                                                                2833157.2833162.\r\n              Bezanson, J., Karpinski, S., Shah, V. B., and Edelman, A.       Maclaurin, D., Duvenaud, D., and Adams, R. P. Autograd:\r\n                 Julia: A fast dynamic language for technical computing.        Effortless gradients in numpy. In ICML 2015 AutoML\r\n                 arXiv preprint arXiv:1209.5145, 2012.                          
Workshop, 2015.\r\n              Chen, T., Li, M., Li, Y., Lin, M., Wang, N., Wang, M., Xiao,    Neubig, G., Dyer, C., Goldberg, Y., Matthews, A., Am-\r\n                 T., Xu, B., Zhang, C., and Zhang, Z. Mxnet: A \ufb02exible          mar, W., Anastasopoulos, A., Ballesteros, M., Chiang,\r\n                 and ef\ufb01cient machine learning library for heterogeneous        D., Clothiaux, D., Cohn, T., Duh, K., Faruqui, M., Gan,\r\n                 distributed systems. CoRR, abs/1512.01274, 2015.               C., Garrette, D., Ji, Y., Kong, L., Kuncoro, A., Ku-\r\n              Decker, J. M., Moldovan, D., Wei, G., Bhardwaj,                   mar, G., Malaviya, C., Michel, P., Oda, Y., Richardson,\r\n                 V., Essertel, G., Wang, F., Wiltschko, A. B., and              M., Saphra, N., Swayamdipta, S., and Yin, P. Dynet:\r\n                 Rompf, T.     The 800 pound python in the machine              The dynamic neural network toolkit.       arXiv preprint\r\n                 learning room.     2019.    URL https://www.cs.                arXiv:1701.03980, 2017.\r\n                 purdue.edu/homes/rompf/papers/decker-                        ONNX Contributors.       Open neural network exchange.\r\n                 preprint201811.pdf.                                            https://github.com/onnx/onnx, 2018.                  Ac-\r\n                                                                                cessed: 2018-09-24.\r\n              DeVito, Z., Hegarty, J., Aiken, A., Hanrahan, P., and Vitek,    Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E.,\r\n                 J. Terra: a multi-stage language for high-performance          DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer,\r\n                 computing. In ACM SIGPLAN Notices, volume 48, pp.              A. Automatic differentiation in pytorch. 2017.\r\n                 105\u2013116. ACM, 2013.\r\n              Finn, C., Abbeel, P., and Levine, S.       
Model-agnostic       Politz, J. G., Martinez, A., Milano, M., Warren, S., Patterson,\r\n                 meta-learning for fast adaptation of deep networks.            D., Li, J., Chitipothu, A., and Krishnamurthi, S. Python:\r\n                 In Proceedings of the 34th International Confer-               the full monty. In OOPSLA, pp. 217\u2013232. ACM, 2013.\r\n                 ence on Machine Learning, ICML 2017, Sydney,                 PyTorch Contributors. Torch script. https://pytorch.\r\n                 NSW, Australia, 6-11 August 2017, pp. 1126\u20131135,               org/docs/master/jit.html, 2018. Accessed:\r\n                 2017. URL http://proceedings.mlr.press/                        2018-09-24.\r\n                 v70/finn17a.html.                                            Rompf,T.andOdersky,M. Lightweightmodularstaging:\r\n              Hale, J.   Deep learning framework power scores 2018.             a pragmatic approach to runtime code generation and\r\n                 https://towardsdatascience.com/deep-                           compiled dsls. In Acm Sigplan Notices, volume 46, pp.\r\n                 learning-framework-power-scores-2018-                          127\u2013136. ACM, 2010.\r\n                 23607ddf297a,2018. Accessed: 2018-09-25.                     Socher, R., Perelygin, A., Wu, J., Chuang, J., Man-\r\n              Haoyi,L.,Holmgren,J.,andBerti,A. Macropy. https://                ning, C. D., Ng, A., and Potts, C.       Recursive deep\r\n                 github.com/lihaoyi/macropy.Accessed: 2018-                     models for semantic compositionality over a sentiment\r\n                 09-25.                                                         treebank. In Proceedings of the 2013 Conference on\r\n                                                                                Empirical Methods in Natural Language Processing,\r\n              Hy Contributers.    Hylang.    https://github.com/                pp. 
1631\u20131642. Association for Computational Lin-\r\n                 hylang/hy,2018. Accessed: 2018-09-25.                          guistics, 2013.   URL http://www.aclweb.org/\r\n              Jeong, E., Cho, S., Yu, G.-I., Jeong, J. S., Shin, D.-J., and     anthology/D13-1170.\r\n                 Chun, B.-G. JANUS: Fast and \ufb02exible deep learning via        Tai, K. S., Socher, R., and Manning, C. D. Improved seman-\r\n                 symbolicgraphexecutionofimperativeprograms. In16th             tic representations from tree-structured long short-term\r\n                 USENIXSymposiumonNetworkedSystemsDesignand 11                  memorynetworks. CoRR, abs/1503.00075, 2015.\r\n                                         AutoGraph: Imperative-style Coding with Graph-based Performance\r\n              Tokui, S., Oono, K., Hido, S., and Clayton, J. Chainer: a        def while_body(i, state, outputs):\r\n                 next-generation open source framework for deep learn-            prev_state = state\r\n                 ing. In Proceedings of Workshop on Machine Learn-                output, state = rnn_cell(\r\n                 ing Systems (LearningSys) in The Twenty-ninth Annual                  input_data[i], state)\r\n                Conference on Neural Information Processing Systems               state = tf.where(\r\n                (NIPS), 2015a. URL http://learningsys.org/                             i < sequence_len,\r\n                                                                                       state,\r\n                 papers/LearningSys_2015_paper_33.pdf.                                 prev_state)\r\n                                                                                  outputs = outputs.write(i, output)\r\n              Tokui, S., Oono, K., Hido, S., and Clayton, J. Chainer: a           return i + 1, state, outputs\r\n                 next-generation open source framework for deep learning.      
def while_cond(i, state, outputs):\r\n                 In NIPS 2015 LearningSys Workshop, volume 5, 2015b.              return i < max_len\r\n                                                                               _, state, outputs = tf.while_loop(\r\n              Torch Autograd Contributors. torch-autograd. https://                 while_cond,\r\n                 github.com/twitter/torch-autograd,2018.                            while_body,\r\n                Accessed: 2018-09-25.                                               loop_vars=(tf.constant(0),\r\n                                                                                                   initial_state,\r\n              van Merrienboer, B., Breuleux, O., Bergeron, A., and Lam-                            outputs))\r\n                                                                               outputs = outputs.stack()\r\n                 blin, P. Automatic differentiation in ml: Where we are        outputs = tf.transpose(outputs, (1, 0, 2))\r\n                 and where we should be going. In Advances in neural           return outputs, state\r\n                 information processing systems, 2018.\r\n              van Merrinboer, B., Wiltschko, A. B., and Moldovan,           B ERRORHANDLING\r\n                 D.   Tangent: automatic differentiation using source       In AutoGraph, there are three distinct steps of execution in\r\n                 code transformation in python. 2017. URL https:            addition to the usual syntax veri\ufb01cation performed by the\r\n                //arxiv.org/pdf/1711.02712.pdf.                             Python runtime:\r\n              Wang, F. and Rompf, T. A language and compiler view              \u2022 Conversion\r\n                 ondifferentiable programming, 2018. URL https://              \u2022 Staging (e.g., TensorFlow graph construction)\r\n                 openreview.net/forum?id=SJxJtYkPG.                            
\u2022 Runtime (e.g., TensorFlow graph execution)\r\n              Wang,F.,Wu,X.,Essertel,G.M.,Decker,J.M.,andRompf,             Thelatter two steps can be associated with the two stages\r\n                T. Demystifying differentiable programming: Shift/reset     in the multi-stage programming model that platforms like\r\n                 the penultimate backpropagator. CoRR, abs/1803.10228,      TensorFlow and PyTorch\u2019s JIT model implement. Each of\r\n                 2018.                                                      these steps has distinct requirements for error handling, but\r\n              Yang, J., Hance, T., Austin, T. H., Solar-Lezama, A., Flana-  principally make use of these two technologies:\r\n                 gan, C., and Chong, S. Precise, dynamic information           \u2022 Source map construction. Each node in the AST, even\r\n                 \ufb02owfor database-backed applications. In Proceedings             after several passes of SCT, is associated to an original\r\n                 of the 37th ACM SIGPLAN Conference on Programming               line of the user\u2019s Python code.\r\n                Language Design and Implementation, PLDI 2016, pp.             \u2022 Error rewriting. Several frames in the stack trace\r\n                 631\u2013647, New York, NY, USA, 2016. ACM.                          
of TensorFlow code, especailly AutoGraph-generated\r\n                                                                                 TensorFlow code, point to lines of code written by\r\n                                                                                 the AutoGraph compiler system rather than the user.\r\n              A DYNAMICRNNIMPLEMENTATION                                         Weareabletoreassociate temporary \ufb01les (used when\r\n                                                                                 generating code in AutoGraph) to the user\u2019s original\r\n              Below is the hand-written graph implementation of the              source \ufb01les.\r\n               tf.dynamic_rnn cell.\r\n               def dynamic_rnn(rnn_cell, input_data,                        Conversion Errors     Conversion errors may occur due to\r\n                 initial_state, sequence_len=None):                         code that is otherwise legal Python, but is unsupported by\r\n                 input_data = tf.transpose(input_data,                      AutoGraph. These errors usually originate inside Auto-\r\n                    (1, 0, 2))                                              Graph internal code.\r\n                 outputs = tf.TensorArray(                                  For usability, such errors must indicate the location in the\r\n                       tf.float32, size=0, dynamic_size=True)               converted code of the idiom that caused the error. In addi-\r\n                 if sequence_length is None:\r\n                    max_len = input_data.shape[0]                           tion, the error message must provide suf\ufb01cient information\r\n                 else:                                                      to allow the developer to remedy the error. 
Lastly, the error\r\n                    max_len = tf.reduce_max(sequence_len)                   stack trace should avoid references to internal code, as they\r\n                                                                        12\r\n                                           AutoGraph: Imperative-style Coding with Graph-based Performance\r\n               are typically uninformative to the user.                               returned as a string.\r\n               Currently, we facilitate this requirement by generating a           \u2022  compiler.ast_to_object(ast_node)                  com-\r\n               stack-trace-like message that indicates the location of the er-        piles an AST into an equivalent Python entity, returned\r\n               ror. In the future, we plan to further improve the conciseness         as a module.\r\n               of error messages of this type.                                  For example:\r\n               Staging Errors     Staging errors can occur in successfully       node = parse_str('a = b')\r\n               converted code and are typically raised because of disal-         print(fmt(node))\r\n               lowed or invalid argument types, shapes, hyperparameter           # Output:\r\n               values or other conditions that are only detectable at run-       Module:\r\n               time. To address this, we plan to generate a stack-trace-like     | body=[\r\n               message with frames from the original code from which the         | | Assign:\r\n               intermediate code was generated. This is facilitated by the       | | | targets=[\r\n               ASTsourcemapthatwemaintainbetweeneachnodeinthe                    | | | | Name:\r\n                                                                                 | | | | | id=\"a\"\r\n               generated AST and the user\u2019s original source code.                
| | | | | ctx=Store()\r\n               Another challenge is that error messages may refer to gen-        | | | | | annotation=None\r\n                                                                                 | | | ]\r\n               erated symbols or to contexts speci\ufb01c to generated code.          | | | value=Name:\r\n               Addressing this shortcoming is a subject of future work.          | | | | id=\"b\"\r\n                                                                                 | | | | ctx=Load()\r\n               Runtime Errors      The name of this class of errors refers       | | | | annotation=None\r\n               to the staged IR runtime.                                         | ]\r\n               For example, integer division by zero errors in TensorFlow:      These utilities make it easy to make small modi\ufb01cations to\r\n                                                                                the AST.\r\n                def f(n):\r\n                  return tf.constant(10, dtype=tf.int32) / n                     node = parse_str('a = b')\r\n                                                                                 node.body[0].value.id = 'c'\r\n               TheIRexecutionenvironment typically includes facilities           print(ast_to_source(node))\r\n               to trace the source of the error to user code. However, in        # Output:\r\n               the case of AutoGraph, that will be generated code. To            a = c\r\n               remedy this, we plan to intercept these errors and attach\r\n               information that helps the user further trace the source of the\r\n               error to original, pre-conversion code. We plan to enhance       TemplatedCodeRewriting Example:\r\n               the user experience with the addition of tf.function in           code_quote = '''\r\n               the TensorFlow 2.0 API.                                           
def fn(args):\r\n                                                                                    body\r\n               C USEFULUTILITIES                                                 '''\r\n                                                                                 new_body = textwrap.dedent('''\r\n               In order to build the system as described, we created a large        a = x\r\n               library of source code transformationtoolsthatweanticipate           b = y\r\n               will be useful to the broader Python community.                      return a + b\r\n                                                                                 ''')\r\n                                                                                 node = templates.replace(\r\n               EasyCodeQuotingandUnquoting Afewoftheutility                         code_quote,\r\n               functions are listed below:                                          fn='my_function',\r\n                                                                                    args=('x', 'y'),\r\n                  \u2022  parser.parse_entity(fn_or_class) takes a                       body=parser.parse_str(new_body).body\r\n                    Python class or function and returns the corresponding       )\r\n                    ASTnode,wrappedinacontaining Module node.                    print(compiler.ast_to_source(node))\r\n                  \u2022  parser.parse_str(code_string) isdenticalto                  # Output:\r\n                     parse_entity ,excepttakesastringofPythoncode                def my_function(x, y):\r\n                    as input. The string may contain any valid Python code.         
a = x\r\n                  \u2022  pretty_printer.fmt(ast_node)               returns    a        b = y\r\n                                                                                    return a + b\r\n                    pretty-printable string representing the AST.\r\n                  \u2022  compiler.ast_to_source(ast_node)                    un-    Thefunction inserts string symbols or AST nodes into the\r\n                    parses an AST into the equivalent Python code, 13 quoted code template, and performs additional integrity\r\n                                           AutoGraph: Imperative-style Coding with Graph-based Performance\r\n               checks. This allows for the easy construction of complicated    D.2    L-BFGS\r\n               code blocks, especially with respect to building the AST        TheL-BFGS(Limited-MemoryBroydenFletcherGoldfarb-\r\n               manually.                                                       Shannon) algorithm is often used for parameter estimation\r\n               D EXPANDEDEXAMPLES                                              in Machine Learning. Our implementation is based on the\r\n                                                                               TensorFlow Eager implementation written by Yaroslav Bu-\r\n               Weexpandonthetoyexamplesinthemainpapertoillus-                  latov3. In our benchmark, AutoGraph is almost 2 times\r\n               trate AutoGraph\u2019s utility when implementing more realistic      faster than Eager with a batch size of 10 in approximately\r\n               algorithms and models. These were implemented using Ten-        the same amount of code.\r\n                                             2\r\n               sorFlow\u2019s benchmark utilities so that they can more easily      D.3    Model-Agnostic Meta-Learning (MAML)\r\n               be run. 
This also allows us to compare the performance of AutoGraph-generated code to other reference implementations, both from AutoGraph\u2019s authors and distributed as part of TensorFlow. We report some preliminary findings for each example.\r\n\r\nAll example code mentioned in this section, as well as the full runnable code for examples found throughout the paper, can be found at https://github.com/tensorflow/autograph/examples/sysml2019.\r\n\r\nD.1   Beam Search\r\n\r\nBeam search is an algorithm often used in machine translation. The algorithm builds candidate sequences by taking the most-likely steps at each transition, possibly discarding less-likely sequences. This is an interesting use case for AutoGraph because beam search consists of complex computation and decisions at each step, with the number of steps capped at the maximum sequence size. The simplest implementation of beam search is a loop that breaks if all candidate sequences have terminated. More robust implementations separately keep track of living and terminal candidate sequences, and break if no living candidate has the potential to outscore a terminal candidate. Breaking out of the loop is essential to the performance of beam search, since it can often generate sequences that are far shorter than the maximum allowable size.\r\n\r\nWe implemented beam search using TensorFlow Eager. Using AutoGraph, the benchmark runs between 2 and 3.2 times faster than the same code run using TensorFlow Eager. The improvement varies as we change the maximum sequence length and vocabulary size. Longer sequences and smaller vocabularies typically show more improvement when using AutoGraph. Longer sequences result in more iterations of the loop, so embedding these loops in the TensorFlow graph with AutoGraph shows more relative improvement. A larger vocabulary results in more expensive vector and matrix operations, taking longer overall, so the loop overhead that AutoGraph removes is a smaller share of the run time.\r\n\r\nModel-Agnostic Meta-Learning (MAML, Finn et al. (2017)) is an algorithm for meta-learning, especially effective for few-shot learning. Our benchmark is based on the sinusoidal example from Finn et al. (2017).\u2074\r\n\r\nWe implemented our MAML benchmark using code that is compatible with both TensorFlow Eager and AutoGraph. When training a single meta-parameter, the AutoGraph-converted code ran 1.9 times faster than the identical code run in Eager mode. AutoGraph-converted code was 2.7 times faster when training 10 meta-parameters.\r\n\r\nD.4   seq2seq\r\n\r\nThe seq2seq (Sequence-to-Sequence) model\u2075 is a general-purpose encoder and decoder that can be used for tasks like machine translation. We implemented this model and a benchmark that measures the performance of the model on random input sequences.\r\n\r\nWe implemented this benchmark in TensorFlow Eager and converted that Eager code using AutoGraph. AutoGraph-converted code was 1.18 to 3.05 times faster than the Eager equivalent. The performance improvement varies with vocabulary size: AutoGraph performs better on larger vocabularies. Varying sequence length from 64 to 128 had minimal effect on the performance improvement. We also implemented optional \u201cteacher forcing\u201d, which almost doubles the improvement gained from AutoGraph. This is because teacher forcing reduces the amount of time spent performing computations, so the overhead of Eager mode is a larger percentage of the overall time. AutoGraph is designed to reduce such overhead, in this case by embedding data-dependent control flow in the graph executed by TensorFlow.\r\n\r\nE   SUPPORTED FEATURES\r\n\r\nTables 4, 5 and 6 show the features of Python and TensorFlow that are presently supported by AutoGraph.\r\n\r\n\u00b2 https://www.tensorflow.org/community/benchmarks\r\n\u00b3 https://github.com/yaroslavvb/stuff/tree/master/eager_lbfgs\r\n\u2074 https://github.com/cbfinn/maml\r\n\u2075 https://google.github.io/seq2seq/\r\n\r\n
Table 4. AutoGraph Supported Features. Columns: Conversion Triggers, Python Semantics, TensorFlow Semantics.\r\n    Control Flow: if; for; while; break / continue / return; try / except / finally / raise; with; yield\r\n    Operators: unary; binary arithmetic; binary equality; binary boolean; ternary conditional\r\n\r\nTable 5. AutoGraph Supported Features (continued). Columns: Conversion Triggers, Python Semantics, TensorFlow Semantics.\r\n    Functions: user-defined; lambda; constructors; instance methods; class methods; built-in; native\r\n    Collections: list literals; list append; list pop; other (dict, set, etc.); get item / set item\r\n    Comprehensions\r\n\r\nTable 6. AutoGraph Supported Features (continued). Columns: Conversion Triggers, Python Semantics, TensorFlow Semantics.\r\n    Variables: undefined; global; nonlocal\r\n    Literals\r\n    Classes: class types; objects; get / set attribute\r\n    Decorators: user; built-in\r\n    Generators\r\n    Power Features: exec; pdb; inspect\r\n", "award": [], "sourceid": 194, "authors": [{"given_name": "Dan", "family_name": "Moldovan", "institution": "Google Inc."}, {"given_name": "James", "family_name": "Decker", "institution": "Purdue University"}, {"given_name": "Fei", "family_name": "Wang", "institution": "Purdue University"}, {"given_name": "Andrew", "family_name": "Johnson", "institution": "Google Inc."}, {"given_name": "Brian", "family_name": "Lee", "institution": 
"Google Inc."}, {"given_name": "Zachary", "family_name": "Nado", "institution": "Google Brain"}, {"given_name": "D", "family_name": "Sculley", "institution": "Google"}, {"given_name": "Tiark", "family_name": "Rompf", "institution": "Purdue University"}, {"given_name": "Alexander B", "family_name": "Wiltschko", "institution": "Google Inc."}]}