- Inter-Generational Family Reconstitution with Enriched Ontologies by David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, and Scott N. Woodfield, July 2019. (784KB .pdf)
- Ontological Deep Data Cleaning by Scott N. Woodfield, Spencer Seeger, Samuel Litster, Stephen W. Liddle, Brenden Grace, and David W. Embley, March 2018. (2.1MB .pdf)
- Extraction Rule Creation by Text Snippet Examples by David W. Embley and George Nagy, February 2018. (1.1MB .pdf)
- Ontological Document Reading: An Experience Report by David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, and Scott N. Woodfield, December 2017. (26.2MB .pdf)
- Conceptual Modeling in Accelerating Information Ingest into Family Tree by David W. Embley, Stephen W. Liddle, Tanner S. Eastmond, Deryle W. Lonsdale, Joseph P. Price, and Scott N. Woodfield, June 2017. (2.7MB .pdf)
- GreenFIE: A Green Form-Based Information-Extraction System for Historical Documents by Tae Woo Kim, May 2017. (2.5MB .pdf) (2.2MB .docx)
- GreenFIE: A Green Form-Based Information-Extraction System for Historical Documents by Tae Woo Kim, David W. Embley and Stephen W. Liddle, February 2017. (230KB .docx)
- GreenFIE: A Green Form-Based Information-Extraction System for Historical Documents (extended abstract) by Tae Woo Kim and David W. Embley, February 2017. (216KB .docx)
- Pragmatic Quality Assessment for Automatically Extracted Data by Scott N. Woodfield, Deryle W. Lonsdale, Stephen W. Liddle, Tae Woo Kim, David W. Embley and Christopher Almquist, July 2016. (723KB .pdf)
- Increasing the Quality of Extracted Information by Reading between the Lines by David W. Embley, Stephen W. Liddle and Joseph Park, April 2016. (361KB .pdf)
- Pragmatic Quality Assessment for Automatically Extracted Data by Scott N. Woodfield, Deryle W. Lonsdale, Stephen W. Liddle, Tae Woo Kim, David W. Embley and Christopher Almquist, April 2016. (2.2MB .pdf)
- "Sanity Checks" over Auto-Extracted Family-History Data by Scott N. Woodfield, David W. Embley, Stephen W. Liddle and Christopher Almquist, January 2016. (281KB .pdf)
- "Sanity Checks" over Auto-Extracted Family-History Data (Extended Abstract) by Scott N. Woodfield, David W. Embley, Stephen W. Liddle and Christopher Almquist, January 2016. (281KB .pdf)
- Cost-Effective Information Extraction from Lists in OCRed Historical Documents by Thomas L. Packer and David W. Embley, August 2015. (1.1MB .pdf)
- Increasing the Quality of Extracted Information by Reading between the Lines by Joseph Park, David W. Embley, and Stephen W. Liddle, May 2015. (457KB .pdf)
- Enabling Efficient Chinese Jiapu Information Extraction (Extended Abstract) by Stephen W. Liddle, Derek Dobson, David W. Embley, and Chuck Liu, February 2015. (354KB .pdf)
- FROntIER: A Framework for Extracting and Organizing Biographical Facts in Historical Documents by Joseph Park, January 2015. (3.6MB .pdf)
- HyKSS: Hybrid Keyword and Semantic Search by Andrew J. Zitzelberger, David W. Embley, Stephen W. Liddle, and Del T. Scott, October 2014. (1.4MB .pdf; the final version is available at Springer via 10.1007/s13740-014-0046-4.)
- Unsupervised Training of HMM Structure and Parameters for OCRed List Recognition and Ontology Population by Thomas L. Packer and David W. Embley, October 2014. (1.7MB .pdf)
- Book Project Operations by Peter Lindes, August 2014. (146KB .docx)
- Unsupervised Training of HMM Structure and Parameters for OCRed List Recognition and Ontology Population by Thomas L. Packer and David W. Embley, July 2014. (1.4MB .pdf)
- Scalable Recognition, Extraction, and Structuring of Data from Lists in OCRed Text using Unsupervised Active Wrapper Induction by Thomas L. Packer and David W. Embley, June 2014. (1.3MB .pdf)
- A Superstructure for Models of Quality by David W. Embley, Stephen W. Liddle, and Scott N. Woodfield, May 2014. (640KB .pdf)
- Multilingual Extraction Ontologies by David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, Byung-Joo Shin, and Yuri Tijerino, April 2014. (1.2MB .pdf)
- Finding Genealogy Facts with Linguistic Analysis by Peter Lindes, Deryle W. Lonsdale, and David W. Embley, March 2014. (412KB .pdf)
- A Superstructure for Organizing Family History Information by David W. Embley and Scott N. Woodfield, March 2014. (225KB .pdf)
- HyKSS: Hybrid Keyword and Semantic Search by Andrew J. Zitzelberger, David W. Embley, Stephen W. Liddle, and Del T. Scott, February 2014. (1.2MB .pdf)
- Big Data—Conceptual Modeling to the Rescue (Extended Abstract) by David W. Embley and Stephen W. Liddle, July 2013. (1.2MB .pdf)
- Cost Effective Ontology Population with Data from Lists in OCRed Historical Documents by Thomas L. Packer and David W. Embley, June 2013. (497KB .pdf)
- OntoSoar: Using Language to Find Genealogy Facts by Peter Lindes, Deryle Lonsdale, and David W. Embley, March 2013. (176KB .pdf)
- Populating Ontologies with Data from Lists in Family History Books by Thomas L. Packer and David W. Embley, March 2013. (231KB .pdf)
- Extracting and Organizing Facts of Interest from OCRed Historical Documents by Joseph S. Park and David W. Embley, March 2013. (575KB .pdf)
- Populating Ontologies with Data from OCRed Lists by Thomas L. Packer and David W. Embley (submitted for review at ICDAR), February 2013. (251KB .pdf)
- Cross-Language Hybrid Keyword and Semantic Search by David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, Joseph Park, Byung-Joo Shin, and Andrew Zitzelberger, March 2012. (939KB .pdf)
- Extracting Information from French Obituaries by Deryle W. Lonsdale, David W. Embley, Stephen W. Liddle, and Joseph Park, February 2012. (147KB .pdf)
- Lessons Learned in Automatically Detecting Lists in OCRed Historical Documents by Thomas L. Packer and David W. Embley, February 2012. (1.8MB .pdf)
- HyKSS: Hybrid Keyword and Semantic Search by Andrew Zitzelberger, Master's Thesis, August 2011. (756KB .pdf)
- Performing Information Extraction to Improve OCR Error Detection in Semi-structured Historical Documents by Thomas Packer, August 2011. (1.8MB .pdf)
- Enabling Search for Facts and Implied Facts in Historical Documents by David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, Spencer Machado, Thomas Packer, Joseph Park, Nathan Tate, and Andrew Zitzelberger, June 2011. (891KB.pdf)
- Principled Pragmatism: A Guide to the Adaptation of Ideas from Philosophical Disciplines to Conceptual Modeling by David W. Embley, Stephen W. Liddle, and Deryle W. Lonsdale, May 2011. (270KB.pdf)
- Generating the Fewest Redundancy-Free XML Scheme Trees from Acyclic Conceptual-Model Hypergraphs in Polynomial Time by Wai Yin Mok, Joseph Fong, and David W. Embley, May 2011. (442KB.pdf)
- A Fact-Oriented, Time-Dependent Formalization of Object-oriented Systems Modeling by Stephen W. Clyde, David W. Embley, Stephen W. Liddle, and Scott N. Woodfield, April 2011. (324KB.pdf)
- Multilingual Ontologies for Cross-Language Information Extraction and Semantic Search by David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, and Yuri Tijerino, April 2011. (555KB.pdf)
- Data Extraction from Web Tables: the Devil is in the Details by George Nagy, Sharad Seth, Dongpu Jin, David W. Embley, Spencer Machado, and Mukkai Krishnamoorthy, March 2011. (115KB.doc)
- Factoring Web Tables by David W. Embley, Mukkai Krishnamoorthy, George Nagy, and Sharad Seth, February 2011. (131KB.pdf)
- Preface for: The Handbook of Conceptual Modeling: Theory, Practice, and Research Challenges edited by David W. Embley and Bernhard Thalheim, August 2010. (91KB.pdf)
- Conceptual-Model Programming: A Manifesto by David W. Embley, Stephen W. Liddle, and Oscar Pastor, July 2010. (703KB.pdf)
- Conceptual Modeling Foundations for a Web of Knowledge by David W. Embley, Stephen W. Liddle, and Deryle Lonsdale, July 2010. (1.7MB.pdf)
- Model-Driven Software Development by Stephen W. Liddle, June 2010. (456KB.pdf)
- Conceptual Modeling for a Web of Knowledge by David W. Embley, Stephen W. Liddle, and Cui Tao, June 2010. (1.6MB.pdf)
- Automating Extraction From and Reasoning About Genealogical Records: A Prototype, by Charla Woodbury, Master's Thesis, June 2010. (854KB.pdf)
- Extracting Person Names from Diverse and Noisy OCR Text by Thomas Packer, Joshua Lutes, Aaron Stewart, David W. Embley, Eric Ringger, Kevin Seppi, and Lee Jensen, May 2010. (715KB.pdf)
- Mapping Conceptual Models to Database Schemas by David W. Embley and Wai Yin Mok, March 2010. (354KB.pdf)
- Ontologies for Multilingual Extraction by Deryle W. Lonsdale, David W. Embley, and Stephen W. Liddle, March 2010. (91KB.pdf)
- Extracting Person Names from Diverse and Noisy OCR Text, Extended Abstract by Thomas Packer, Joshua Lutes, Aaron Stewart, David Embley, Eric Ringger, Kevin Seppi, and Lee Jensen, April 2010. (54KB.docx)
- Extracting Names Using Layout Clues: An Initial Report by Aaron P. Stewart and David W. Embley, March 2010. (451KB.pdf)
- Automatic Extraction from and Reasoning about Genealogical Records: A Prototype by Charla J. Woodbury, David W. Embley, and Stephen W. Liddle, March 2010. (259KB.docx)
- Ontologies for Multilingual Extraction by Deryle W. Lonsdale, David W. Embley, and Stephen W. Liddle, February 2010. (81KB.pdf)
- Extracting Person Names from Diverse and Noisy OCR Text by Thomas Packer, Joshua Lutes, Aaron Stewart, David W. Embley, Eric Ringger, and Kevin Seppi, January 2010. (646KB.pdf)
- Extracting a Largest Redundancy-Free XML Storage Structure from an Acyclic Hypergraph in Polynomial Time by Wai Yin Mok, Joseph Fong, and David W. Embley, December 2009. (374KB.pdf)
- Data Frame Augmentation of Free Form Queries for Constraint Based Document Filtering by Andrew Zitzelberger, December 2009. (301KB.pdf)
- Theoretical Foundations for Enabling a Web of Knowledge by David W. Embley and Andrew Zitzelberger, August 2009. (634KB.pdf)
- KBB: A Knowledge-Bundle Builder for Research Studies by David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale, Aaron Stewart, and Cui Tao, May 2009. (252KB.pdf)
- FOCIH: Form-based Ontology Creation and Information Harvesting by Cui Tao, David W. Embley, and Stephen W. Liddle, April 2009. (1.6MB.pdf)
- Conceptual Modeling for a Web of Knowledge by David W. Embley, Stephen W. Liddle, and Cui Tao, March 2009. (1.1MB.pdf)
- Domain-Independent Data Extraction: Person Names by Carl Christensen and Deryle Lonsdale, March 2009. (436KB.doc)
- A Model of World Wide Web Evolution by Yihong Ding, Li Xu, and David W. Embley, March 2009. (81K.doc)
- Enabling a Web of Knowledge by Cui Tao, David W. Embley, and Stephen W. Liddle, January 2009. (512K.pdf)
- Ontology Generation, Information Harvesting and Semantic Annotation For Machine-Generated Web Pages, PhD Dissertation by Cui Tao, December 2008. (2.36MB.pdf)
- Extracting a Largest Redundancy-Free XML Storage Structure from an Acyclic Hypergraph in Polynomial Time by Wai Yin Mok, Joseph Fong, and David W. Embley, (revised manuscript), November 2008. (159K.pdf)
- Categorization of Web Documents using Extraction Ontologies by Li Xu and David W. Embley, International Journal of Metadata, Semantics and Ontologies, Vol. 3, No. 1, 2008. (762K.pdf)
- A Conceptual-Model-Based Computational Alembic for a Web of Knowledge by D.W. Embley, S.W. Liddle, D. Lonsdale, G. Nagy, Y. Tijerino, R. Clawson, J. Crabtree, Y. Ding, P. Jha, Z. Lian, S. Lynn, R.K. Padmanabhan, J. Peters, C. Tao, R. Watts, C. Woodbury, and A. Zitzelberger, ER2008, October 2008. (331K.pdf)
- Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation by Cui Tao and David W. Embley, (revised manuscript), October 2008. (159K.pdf)
- Semantically Conceptualizing and Annotating Tables by Stephen Lynn and David W. Embley, Proceedings of the 3rd Asian Semantic Web Conference (ASWC2008) (submitted manuscript), July 2008. (159K.pdf)
- Foundational Data Modeling and Schema Transformations for XML Data Engineering, Proceedings of the 2nd International United Information Systems Conferences (UNISCON'08) (invited paper) by R. Al-Kamha, D.W. Embley, and S.W. Liddle, April 2008. (165K.pdf)
- Automating Mini-Ontology Generation from Canonical Tables, by Stephen Lynn, Master's Thesis, April 2008. (1.05MB .pdf)
- A Tool to Support Ontology Creation Based on Incremental Mini-Ontology Merging, by Zonghui Lian, Master's Thesis, March 2008. (1.06MB .pdf)
- Automatic Generation of Ontologies from Canonicalized Web Tables, by Stephen Lynn and David W. Embley, (submitted manuscript), March 2008. (256K.pdf)
- Reusing Ontologies and Language Components for Ontology Generation, Data & Knowledge Engineering (submitted manuscript) by Deryle W. Lonsdale, David W. Embley, Yihong Ding, Li Xu, and Martin Hepp, March 2008. (275K.pdf)
- Pattern Markup Language: A Pattern-Based Tool for Quickly Automating Genealogy Data Extraction, Proceedings of the 8th Annual Family History Technology Workshop by Jonathan Baker, Hilton Campbell, Jordan Crabtree, and David W. Embley, March 2008. (359K.pdf)
- Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation, by Cui Tao and David W. Embley, Data & Knowledge Engineering (submitted manuscript), January 2008. (902KB.pdf)
- Automatic Hidden-Web Table Interpretation by Sibling Page Comparison, by Cui Tao and David W. Embley, ER'07, November 2007. (.pdf)
- Ontology Aware Software Service Agents: Meeting Ordinary User Needs on the Semantic Web, PhD Dissertation by Muhammed J. Al-Muhammed, July 2007. (849KB.pdf)
- Generating Ontologies via Language Components and Ontology Reuse, Proceedings of the 12th International Conference on Applications of Natural Language to Information Systems (NLDB'07) by Yihong Ding, Deryle Lonsdale, David W. Embley, Martin Hepp, and Li Xu, June 2007. (157K.pdf)
- Conceptual XML for Systems Analysis by Reema Al-Kamha, PhD Dissertation, June 2007. (1.56MB.pdf)
- Interactive Wang Notation Tool for Web Tables, by Piyushee Jha, May 2007. (385KB.pdf)
- Seed-based Generation of Personalized Bio-Ontologies for Information Extraction, by Cui Tao and David W. Embley, CMLSA 2007 (submitted manuscript), May 2007. (385KB.pdf)
- Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation by Yihong Ding, David W. Embley, and Stephen W. Liddle, ONISW 2007 (submitted manuscript), April 2007. (351KB.pdf)
- Bringing Web Principles to Services: Ontology-Based Web Services, by Muhammed J. Al-Muhammed, David W. Embley, Stephen W. Liddle, and Yuri A. Tijerino, SWSP 2007, April 2007. (80KB.pdf)
- Augmenting Traditional Conceptual Models to Accommodate XML Structural Constructs, by Reema Al-Kamha, David W. Embley, and Stephen W. Liddle, ER'07 (submitted manuscript), April 2007. (832KB)
- A Composite Approach to Automating Direct and Indirect Schema Mappings, by Li Xu and David W. Embley, Information Systems (submitted version), December 2006. (421KB.pdf)
- Formulating Queries for Assessing Clinical Trial Eligibility (extended journal version), by Deryle Lonsdale, Clint Tustison, Craig Parker, and David W. Embley, Data & Knowledge Engineering (submitted version), October 2006. (142KB.pdf)
- Toward Making Online Biological Data Machine Understandable, by Cui Tao, The 5th International Semantic Web Conference (ISWC 2006) Doctoral Consortium, Athens, GA, November 2006. (72KB)
- HTML Table Interpretation by Sibling Page Comparison in the Molecular Biology Domain, by Cui Tao and David W. Embley, 3rd Biotechnology and Bioinformatics Symposium (BIOT 2006), Provo, UT, October 2006. (298KB)
- Automatic Creation of Web Services from Extraction Ontologies, by Cui Tao, Yihong Ding, and Deryle Lonsdale, First International Workshop on Semantic Web Applications: Theory and Practice (SWAT 2006) in conjunction with ER 2006, Tucson, Arizona, November 2006. (243KB)
- Avoiding Deceptive Annotation in the Semantic Web, by Yihong Ding, Ying Ding, David W. Embley, and Omair Shafiq, in Proceedings of the First Semantic Authoring and Annotation Workshop (SAAW 2006) in conjunction with ISWC 2006, Athens, Georgia, November 2006. (143KB.pdf)
- Ontology-Based Constraint Recognition for Free-Form Service Requests, by Muhammed J. Al-Muhammed and David W. Embley, ICDE 2007 (submitted manuscript), July 2006. (134KB.pdf)
- Ontology-Based Free-Form Query Processing for the Semantic Web, by Mark Vickers, Master's Thesis, June 2006. (2.2MB .pdf)
- OWL-AA: Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation, by Yihong Ding, David W. Embley, and Stephen W. Liddle, Technical Report, April 2006. (77KB)
- Combining Declarative and Procedural Knowledge to Automate and Represent Ontology Mapping, by Li Xu, David W. Embley, and Yihong Ding, SWAT 2006 (submitted manuscript), April 2006. (131KB)
- Automatic Creation and Simplified Querying of Semantic Web Content: An Approach Based on Information-Extraction Ontologies, by Yihong Ding, David W. Embley, and Stephen W. Liddle, ASWC'06 (submitted manuscript), March 2006. (526KB)
- Using Data-Extraction Ontologies to Foster Automating Semantic Annotation, by Yihong Ding and David W. Embley, ICDE 2006 PhD Workshop, April 2006. (submitted manuscript) (49KB .pdf)
- Formulating Queries for Assessing Clinical Trial Eligibility, by Deryle Lonsdale, Clint Tustison, Craig Parker, and David W. Embley, NLDB'06 (submitted manuscript), January 2006. (111KB.pdf)
- Study of Design Issues on an Automated Semantic Annotation System, by Yihong Ding, AIS SIGSEMIS Bulletin, Vol. 2, No. 3-4, July-December 2005. (144KB .pdf)
- Generating Data-Extraction Ontologies By Example, by Yuanqiu (Joe) Zhou, Master's Thesis, December 2005. (1.0MB .pdf)
- Generating Medical Logic Modules for Clinical Trial Eligibility, by Craig Parker, Master's Thesis, November 2005. (920KB .pdf)
- Resolving Underconstrained and Overconstrained Systems of Conjunctive Constraints for Service Requests, by Muhammed J. Al-Muhammed and David W. Embley, CAiSE2006 (submitted manuscript), November 2005. (190KB.pdf)
- Filtering Web Pages with Application Ontologies, by Li Xu and David W. Embley, (submitted manuscript), October 2005. (2.3MB.pdf)
- Representing Generalization/Specialization in XML Schema by Reema Al-Kamha, David W. Embley, and Stephen W. Liddle, EMISA Workshop (submitted manuscript), August 2005. (133KB .pdf)
- Automatic Creation and Simplified Querying of Semantic Web Content, by David W. Embley, Yihong Ding, Stephen W. Liddle, and Mark Vickers, Technical Report, April 2005. (417KB .pdf)
- Conceptual Model Based Semantic Web Services, by Muhammed Al-Muhammed, David W. Embley, and Stephen W. Liddle, ER2005 (submitted manuscript), April 2005. (219KB .pdf)
- A Generalized Framework for an Ontology-Based Data-Extraction System, by Alan Wessman, Stephen W. Liddle, and David W. Embley, ISTA 2005 (submitted manuscript), March 2005. (105KB .pdf)
- Toward Tomorrow's Semantic Web -- An Approach Based on Information Extraction Ontologies, by David W. Embley, Position Paper for Dagstuhl Seminar, January 2005. (84KB .pdf)
- A Framework for Extraction Plans and Heuristics in an Ontology-Based Data-Extraction System, by Alan Wessman, Master's Thesis, December 2004. (1.58MB .doc)
- Retrieving Danish Genealogical Records on the Semantic Web by Charla Woodbury, Technical Report, December 2004. (1196KB .doc)
- Querying Disjunctive Databases in Polynomial Time by Lars Olson and David W. Embley, (submitted manuscript), December 2004. (175KB .pdf)
- Automating the Extraction of Domain-Specific Information from the Web -- A Case Study for the Genealogical Domain, by Troy Walker, Master's Thesis, November 2004. (972KB .pdf)
- Logical Form Identification for Medical Clinical Trials, by Clint A. Tustison, Master's Thesis, August 2004. (275KB .pdf)
- A Composite Approach to Automating Direct and Indirect Schema Mappings, by Li Xu and David W. Embley, (submitted manuscript), July 2004. (419KB .pdf)
- Grouping Search-Engine Returned Citations for Person-Name Queries by Reema Al-Kamha and David W. Embley, ACM 6th International Workshop on Web Information and Data Management (WIDM 2004), June 2004. (505KB .pdf)
- Grouping Search-Engine Returned Citations for Person-Name Queries by Reema Al-Kamha, Master's Thesis, June 2004. (1.27M.pdf)
- Automatic Location and Separation of Records: A Case Study in the Genealogical Domain, by Troy Walker and David W. Embley, CoMWIM04 Workshop (submitted manuscript), May 2004. (2.788KB .pdf)
- Query Rewriting for Extracting Data behind HTML Forms by Xueqi (Helen) Chen, David W. Embley, and Stephen W. Liddle, CoMWIM04 Workshop (submitted manuscript), May 2004. (2.4MB .doc)
- Dynamic Matchmaking Between Messages and Services in Multi-Agent Systems by Muhammed Al-Muhammed, Master's Thesis, May 2004. (3.95MB.doc)
- Toward a Flexible Human-Agent Collaboration Framework with Mediating Domain Ontologies for the Semantic Web by Yuri A. Tijerino and Muhammed Al-Muhammed, submitted, April 2004. (223KB .pdf)
- Automating the Extraction of Data from HTML Tables with Unknown Structure by David W. Embley, Cui Tao, and Stephen W. Liddle, Data & Knowledge Engineering, (submitted manuscript), May 2005. (663KB .pdf)
- Towards Ontology Generation from Tables by Yuri A. Tijerino, David W. Embley, Deryle W. Lonsdale, Yihong Ding, and George Nagy, World Wide Web Journal, Vol. 8, No. 3, September 2005. pp.261-285. (365KB .pdf)
- Towards Enabling Communication among Independent Agents in the Semantic Web by Muhammed Al-Muhammed and David W. Embley, submitted, April 2004. (233KB .pdf)
- Automatic Direct and Indirect Schema Mapping: Experiences and Lessons Learned by David W. Embley, Li Xu, and Yihong Ding, SIGMOD Record, Vol. 33, No. 4, December 2004. pp.14-19. (239KB .pdf)
- Enterprise Modeling with Conceptual XML by David W. Embley, Stephen W. Liddle, and Reema Al-Kamha, ER 2004 (submitted manuscript), April 2004. (998KB .pdf)
- Query Rewriting for Extracting Data Behind HTML Forms by Xueqi Chen, Master's Thesis, March 2004. (1.4M .pdf)
- Automating the Extraction of Genealogical Information from the Web by Troy Walker and David W. Embley, Fourth Annual Workshop on Technology for Family History and Genealogical Research, March 2004. (880K .pdf)
- Toward Semantic Understanding -- An Approach Based on Information Extraction Ontologies by David W. Embley, The Fifteenth Australasian Database Conference, January 2004. (246K .pdf)
- Information Extraction and Integration from Heterogeneous Biological Data Sources by Cui Tao, December 2003. (117K .pdf)
- Automating Schema Mapping for Data Integration by Li Xu and David W. Embley, submitted, August 2003. (546K .pdf)
- Schema Matching and Data Extraction over HTML Tables by Cui Tao, Master's Thesis, September 2003. (1.4M .pdf)
- Querying Disjunctive Databases in Polynomial Time by Lars Olson, Master's Thesis, August 2003. (1.3M .pdf)
- Source Discovery and Schema Mapping for Data Integration by Li Xu, PhD Dissertation, July 2003. (28.7M .ps)
- Semiautomatic Generation of Resilient Data-Extraction Ontologies by Yihong Ding, Master's Thesis, June 2003. (1.8M .ps)
- Dynamic Matchmaking Between Messages and Services in Multi-Agent Information Systems by Muhammed Al-Muhammed and David W. Embley, International Workshop on Agent-Oriented Information Systems, October 2003. (60K .doc)
- Automating the Extraction of Data from HTML Tables with Unknown Structure by David W. Embley, Cui Tao, and Stephen W. Liddle, submitted, May 2003. (703K .pdf)
- An Integrated Ontology Development Environment for Data Extraction by Stephen W. Liddle, Kimball A. Hewett, and David W. Embley, ISTA2003, June 2003. (published version, 13 pages, 94K .pdf; manuscript as submitted, 15 pages, 178K .pdf)
- Using Schema Mapping to Facilitate Data Integration by Li Xu and David W. Embley, April 2003. (258K .pdf)
- Results of Using an Efficient Algorithm to Query Disjunctive Genealogical Data by L.E. Olson and D.W. Embley, Proceedings of the Third Annual Workshop on Technology for Family History and Genealogical Research, Brigham Young University, April 2003. (40K .pdf)
- Ontology-Based Extraction of RDF Data from the World Wide Web by Tim Chartrand, Master's Thesis, March 2003. (1.2M .pdf)
- Combining the Best of Global-as-View and Local-as-View for Data Integration by Li Xu and D.W. Embley, November 2002. (202K .pdf)
- Discovering Direct and Indirect Matches for Schema Elements by Li Xu and D.W. Embley, The 8th International Conference on Database Systems for Advanced Applications (DASFAA'03). (202K .pdf)
- Attribute Match Discovery in Information Integration: Exploiting Multiple Facets of Metadata by D.W. Embley, David Jackman, and Li Xu, Journal of the Brazilian Computing Society. (390K .ps)
- Extracting Information from Heterogeneous Information Sources Using Ontologically Specified Target Views, by J. Biskup and D.W. Embley, Information Systems, Volume 28, Number 3, 2003, 169-212. (438K .pdf, 958K .ps)
- Performing Binary-Categorization on Multiple-Record Web Documents Using Information Retrieval Models and Application Ontologies, by L.W. Kwong and Y.-K. Ng, World Wide Web, Volume 6, Number 3, 2003, 281-303. (8.2MB .ps)
- Recognizing Records from the Extracted Cells of Microfilm Tables by K.M. Tubbs and D.W. Embley, Proceedings of the Symposium on Document Engineering 2002, McLean, Virginia, November 2002. (860K .pdf)
- Extracting Data Behind Web Forms by S.W. Liddle, D.W. Embley, D.T. Scott, and S.H. Yau, Proceedings of the Workshop on Conceptual Modeling Approaches for e-Business, Tampere, Finland, October, 2002. (245K .pdf)
- Automatically Extracting Ontologically Specified Data from HTML Tables with Unknown Structure by D.W. Embley, C. Tao, and S.W. Liddle, Proceedings of the 21st International Conference on Conceptual Modeling, Tampere, Finland, October 2002. (422K .pdf)
- Representing and Querying Semistructured Web Data Using Nested Tables with Structural Variants, by I.M.E. Filha, A.S. da Silva, A.H.F. Laender, and D.W. Embley, Proceedings of the 21st International Conference on Conceptual Modeling, Tampere, Finland, October, 2002. (210meg .pdf)
- Peppering Knowledge Sources with SALT; Boosting Conceptual Content for Ontology Generation by D. Lonsdale, Y. Ding, D.W. Embley, and A. Melby, Proceedings of the AAAI Workshop on Semantic Web Meets Language Resources, Edmonton, Alberta, Canada, July 2002. (603K .pdf)
- Mapping Target Schemas to Source Schemas Using WordNet Hierarchies and Structure Context by David Jackman, Master's Thesis, June 2002. (4.7M .doc)
- Using Nested Tables for Representing and Querying Semistructured Web Data, by I.M.E. Filha, A.S. da Silva, A.H.F. Laender, and D.W. Embley, The Fourteenth International Conference on Advanced Information Systems Engineering, Toronto, Canada, 27-31 May 2002. (210meg .pdf)
- Efficiently Querying Contradictory and Uncertain Genealogical Data by L.E. Olson and D.W. Embley, Proceedings of the Second Annual Workshop on Technology for Family History and Genealogical Research, Brigham Young University, April 2002. (77K .pdf)
- Recognizing Records from the Extracted Cells of Genealogical Microfilm Tables by Kenneth M. Tubbs, Master's Thesis, December 2001. (2.3M .pdf)
- Automating the Extraction of Data Behind Web Forms by Sai Ho (Tony) Yau, Master's Thesis, December 2001. (1.4M Word)
- A Binary-Categorization Approach for Classifying Multiple-Record Web Documents Using a Probabilistic Model Retrieval Model by Quan Wang, Master's Thesis. (16.8M .ps)
- A Probabilistic Model for Binary Categorization of Multiple-Record Web Documents by June Tang, Master's Thesis. (793K .ps)
- On the Automatic Extraction of Data from the Hidden Web by S.W. Liddle, S.H. Yau, and D.W. Embley, Proceedings of the International Workshop on Data Semantics in Web Information Systems (DASWIS-2001), Yokohama, Japan, 27-30 November 2001. (181K .pdf)
- Recognizing Ontology-Applicable Multiple-Record Web Documents, by D.W. Embley, Y.-K. Ng, and L. Xu, Proceedings of the 20th International Conference on Conceptual Modeling (ER2001), Yokohama, Japan, 27-30 November 2001. (2.1meg .pdf, 7.0meg .ps)
- Multifaceted Exploitation of Metadata for Attribute Match Discovery in Information Integration, by D.W. Embley, D. Jackman, and L. Xu, Proceedings of WIIW01, Rio de Janeiro, Brazil, 9-11 April 2001. (108K .pdf, 247K .ps)
- Locating and Reconfiguring Records in Unstructured Multiple-Record Web Documents, by D.W. Embley and L. Xu, LNCS 1997 (787K .ps)
- Mediated Information Gain, by J. Biskup and D.W. Embley, International Database Engineering and Application Symposium, (179K .ps)
- An Integrated Ontology Development Environment for Data Extraction, by Kimball A. Hewett, Master's Thesis, April 2000. (3M .pdf)
- Record Location and Reconfiguration in Unstructured Multiple-Record Web Documents, by D.W. Embley and L. Xu, WebDB'00 Proceedings (603K .ps)
- Demonstration: A Robust Web Data-Extraction Technique With High Recall and Precision, DEG Technical Report, (202K .pdf, 1,803K .ps)
- Conceptual-Model-Based Data Extraction from Multiple-Record Web Documents, by D.W. Embley, E.M. Campbell, Y.S. Jiang, S.W. Liddle, D.W. Lonsdale, Y.-K. Ng, and R.D. Smith, Data & Knowledge Engineering, November 1999 (227K .pdf, 425K .ps)
- Automatically Extracting Structure and Data from Business Reports, by S.W. Liddle, D.M. Campbell, and C. Crawford, CIKM'99 Proceedings (186K .pdf, 385K .ps)
- Ontology Suitability for Uncertain Extraction of Information from Multi-Record Web Documents, by D.W. Embley, N. Fuhr, C.-P. Klas, and T. Roelleke, ADI'99 Proceedings (111K .pdf, 255K .ps)
- Record-Boundary Discovery in Web Documents, by D.W. Embley, Y.S. Jiang, and Y.-K. Ng, SIGMOD'99 Proceedings (248K .pdf, 313K .ps)
- Record-Boundary Discovery in Web Documents, Master's Thesis by Yuan Jiang. (435K .pdf)
- A Conceptual-Modeling Approach to Extracting Data from the Web, by D.W. Embley, D.M. Campbell, Y.S. Jiang, Y.-K. Ng, R.D. Smith, S.W. Liddle, and D.W. Quass, ER'98 Proceedings (173K .pdf, 394K .ps)
- Ontology-Based Extraction and Structuring of Information from Data-Rich Unstructured Documents, by D.W. Embley, D.M. Campbell, and R.D. Smith, CIKM'98 Proceedings (170K .pdf, 323K .ps)
- Cardinality Constraints in Semantic Data Models, by S.W. Liddle, D.W. Embley, and S.N. Woodfield, Data & Knowledge Engineering, 1993. (7.8MB .pdf)
- A Scheme-Driven Natural Language Query Translator, by D.W. Embley and R.E. Kimbrell, 1985 ACM Computer Science Conference Proceedings. (654KB .pdf)
- Programming with Data Frames for Everyday Data Items by D.W. Embley, AFIPS'80 Proceedings. (4.5M .pdf)