
References

Published online by Cambridge University Press:  24 January 2020

Qiang Yang (Hong Kong University of Science and Technology)
Yu Zhang (Hong Kong University of Science and Technology)
Wenyuan Dai (4Paradigm Co., Ltd.)
Sinno Jialin Pan (Nanyang Technological University, Singapore)
Type: Chapter
Transfer Learning, pp. 336–376
Publisher: Cambridge University Press
Print publication year: 2020


1000 Genomes Project Consortium. 2015. A global reference for human genetic variation. Nature, 526(7571), 68–74.
Aas, Kjersti. 2001. Microarray Data Mining: A Survey. Tech. report Norwegian Computing Center.
Abadi, Martín, Barham, Paul, Chen, Jianmin, et al. 2016a. TensorFlow: A system for large-scale machine learning. Pages 265–283 of: Keeton, Kimberly, and Roscoe, Timothy (eds.), Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation.
Abadi, Martín, Chu, Andy, Goodfellow, Ian J., et al. 2016b. Deep learning with differential privacy. Pages 308–318 of: Proceedings of ACM Conference on Computer and Communications Security.
Abu-El-Haija, Sami, Kothari, Nisarg, Lee, et al. 2016. YouTube-8M: A large-scale video classification benchmark. arXiv preprint, arXiv:1609.08675.
Acharya, Ayan, Mooney, Raymond J., and Ghosh, Joydeep. 2014. Active multitask learning using both latent and supervised shared topics. Pages 190–198 of: Proceedings of the 2014 SIAM International Conference on Data Mining.
Amaldi, Edoardo, and Kann, Viggo. 1998. On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 209(1), 237–260.
Ando, Rie Kubota, and Zhang, Tong. 2005. A framework for learning predictive structures from multiple tasks and unlabeled data. Journal of Machine Learning Research, 6, 1817–1853.
Antony, Joseph, McGuinness, Kevin, O’Connor, Noel E., and Moran, Kieran. 2016. Quantifying radiographic knee osteoarthritis severity using deep convolutional neural networks. Pages 1195–1200 of: 23rd International Conference on Pattern Recognition.
Argyriou, Andreas, Evgeniou, Theodoros, and Pontil, Massimiliano. 2006. Multi-task feature learning. Pages 41–48 of: Advances in Neural Information Processing Systems.
Argyriou, Andreas, Evgeniou, Theodoros, and Pontil, Massimiliano. 2008. Convex multi-task feature learning. Machine Learning, 73(3), 243–272.
Argyriou, Andreas, Micchelli, Charles A., and Pontil, Massimiliano. 2009. When is there a representer theorem? Vector versus matrix regularizers. Journal of Machine Learning Research, 10, 2507–2529.
Argyriou, Andreas, Micchelli, Charles A., and Pontil, Massimiliano. 2010. On spectral learning. Journal of Machine Learning Research, 11, 935–953.
Arık, Sercan Ö., Chrzanowski, Mike, Coates, Adam, et al. 2017. Deep voice: Real-time neural text-to-speech. Pages 195–204 of: Proceedings of International Conference on Machine Learning.
Arjovsky, Martín, and Bottou, Léon. 2017. Towards principled methods for training generative adversarial networks. CoRR, abs/1701.04862.
Arjovsky, Martín, Chintala, Soumith, and Bottou, Léon. 2017. Wasserstein generative adversarial networks. Pages 214–223 of: Proceedings of the 34th International Conference on Machine Learning.
Ashley, Kevin D. 1991. Reasoning with cases and hypotheticals in HYPO. International Journal of Man-Machine Studies, 34(6), 753–796.
Augenstein, Isabelle, and Søgaard, Anders. 2017. Multi-task learning of keyphrase boundary classification. Pages 341–346 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Aytar, Yusuf, and Zisserman, Andrew. 2011. Tabula rasa: Model transfer for object category detection. Pages 2252–2259 of: Proceedings of IEEE International Conference on Computer Vision.
Azar, Mohammad Gheshlaghi, Lazaric, Alessandro, and Brunskill, Emma. 2013. Sequential transfer in multi-armed bandit with finite set of models. Pages 2220–2228 of: Advances in Neural Information Processing Systems.
Bächlin, Marc, Roggen, Daniel, Tröster, Gerhard, et al. 2009. Potentials of enhanced context awareness in wearable assistants for Parkinson’s disease patients with the freezing of gait syndrome. Pages 123–130 of: Proceedings of the 13th IEEE International Symposium on Wearable Computers.
Bahdanau, Dzmitry, Cho, Kyunghyun, and Bengio, Yoshua. 2014. Neural machine translation by jointly learning to align and translate. CoRR, abs/1409.0473.
Bakas, Spyridon, Akbari, Hamed, Sotiras, Aristeidis, et al. 2017. Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 4, 170117.
Bakker, Bart, and Heskes, Tom. 2003. Task clustering and gating for Bayesian multitask learning. Journal of Machine Learning Research, 4, 83–99.
Baktashmotlagh, Mahsa, Harandi, Mehrtash T., Lovell, Brian C., and Salzmann, Mathieu. 2013. Unsupervised domain adaptation by domain invariant projection. Pages 769–776 of: Proceedings of IEEE International Conference on Computer Vision.
Baktashmotlagh, Mahsa, Harandi, Mehrtash T., Lovell, Brian C., and Salzmann, Mathieu. 2014. Domain adaptation on the statistical manifold. Pages 2481–2488 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Balcan, Maria-Florina, Blum, Avrim, and Vempala, Santosh. 2015. Efficient representations for lifelong learning and autoencoding. Pages 191–210 of: Proceedings of the 28th Conference on Learning Theory.
Balikas, Georgios, Moura, Simon, and Amini, Massih-Reza. 2017. Multitask learning for fine-grained Twitter sentiment analysis. Pages 1005–1008 of: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Bao, Ling, and Intille, Stephen S. 2004. Activity recognition from user-annotated acceleration data. Pages 1–17 of: Proceedings of the Second International Conference on Pervasive Computing.
Barreto, André, Dabney, Will, Munos, Rémi, et al. 2017. Successor features for transfer in reinforcement learning. Pages 4058–4068 of: Advances in Neural Information Processing Systems.
Bartlett, Peter L., and Mendelson, Shahar. 2002. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3, 463–482.
Barzilai, Aviad, and Crammer, Koby. 2015. Convex multi-task learning by clustering. Pages 65–73 of: Proceedings of the 18th International Conference on Artificial Intelligence and Statistics.
Bassily, Raef, Smith, Adam D., and Thakurta, Abhradeep. 2014. Private empirical risk minimization: Efficient algorithms and tight error bounds. Pages 464–473 of: Proceedings of IEEE Annual Symposium on Foundations of Computer Science.
Baxter, Jonathan. 2000. A model of inductive bias learning. Journal of Artificial Intelligence Research, 12, 149–198.
Bay, Herbert, Ess, Andreas, Tuytelaars, Tinne, and Van Gool, Luc. 2008. Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3), 346–359.
Bello, Irwan, Zoph, Barret, Vasudevan, Vijay, and Le, Quoc V. 2017. Neural optimizer search with reinforcement learning. Pages 459–468 of: Proceedings of the 34th International Conference on Machine Learning.
Belmont, John M., Butterfield, Earl C., and Ferretti, Ralph P. 1982. To secure transfer of training instruct self-management skills. Pages 147–154 of: Detterman, Douglas K., and Sternberg, Robert J. (eds.), How and How Much Can Intelligence Be Increased. Ablex Publishing Corporation.
Ben-David, Shai, and Borbely, Reba Schuller. 2008. A notion of task relatedness yielding provable multiple-task learning guarantees. Machine Learning, 73(3), 273–287.
Ben-David, Shai, and Schuller, Reba. 2003. Exploiting task relatedness for multiple task learning. Pages 567–580 of: Proceedings of the 16th Annual Conference on Computational Learning Theory.
Ben-David, Shai, Gehrke, Johannes, and Schuller, Reba. 2002. A theoretical framework for learning from a pool of disparate data sources. Pages 443–449 of: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Ben-David, Shai, Blitzer, John, Crammer, Koby, and Pereira, Fernando. 2006. Analysis of representations for domain adaptation. Pages 137–144 of: Advances in Neural Information Processing Systems.
Ben-David, Shai, Blitzer, John, Crammer, Koby, et al. 2010. A theory of learning from different domains. Machine Learning, 79(1–2), 151–175.
Bengio, Yoshua. 2009. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1–127.
Bengio, Yoshua. 2012. Deep learning of representations for unsupervised and transfer learning. Pages 17–36 of: Proceedings of ICML Workshop on Unsupervised and Transfer Learning.
Bengio, Yoshua, Lamblin, Pascal, Popovici, Dan, and Larochelle, Hugo. 2007. Greedy layer-wise training of deep networks. Pages 153–160 of: Advances in Neural Information Processing Systems.
Bengio, Yoshua, Courville, Aaron, and Vincent, Pascal. 2013. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798–1828.
Bi, Jinbo, Xiong, Tao, Yu, Shipeng, Dundar, Murat, and Rao, R. Bharat. 2008. An improved multi-task learning approach with applications in medical diagnosis. Pages 117–132 of: Proceedings of European Conference on Machine Learning and Practice of Knowledge Discovery in Databases.
Bickel, Steffen, Brückner, Michael, and Scheffer, Tobias. 2007. Discriminative learning for differing training and test distributions. Pages 81–88 of: Proceedings of the 24th International Conference on Machine Learning.
Bickel, Steffen, Bogojeska, Jasmina, Lengauer, Thomas, and Scheffer, Tobias. 2008. Multi-task learning for HIV therapy screening. Pages 56–63 of: Proceedings of the Twenty-Fifth International Conference on Machine Learning.
Biermann, Alan W., and Long, Philip M. 1996. The composition of messages in speech-graphics interactive systems. Pages 97–100 of: Proceedings of the 1996 International Symposium on Spoken Dialogue.
Blitzer, John, McDonald, Ryan, and Pereira, Fernando. 2006. Domain adaptation with structural correspondence learning. Pages 120–128 of: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing.
Blitzer, John, Crammer, Koby, Kulesza, Alex, Pereira, Fernando, and Wortman, Jennifer. 2007a. Learning bounds for domain adaptation. Pages 129–136 of: Advances in Neural Information Processing Systems.
Blitzer, John, Dredze, Mark, and Pereira, Fernando. 2007b. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. Pages 440–447 of: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics.
Blum, Avrim, and Mitchell, Tom M. 1998. Combining labeled and unlabeled data with co-training. Pages 92–100 of: Bartlett, Peter L., and Mansour, Yishay (eds.), Proceedings of the Eleventh Annual Conference on Computational Learning Theory.
Bollegala, Danushka, Maehara, Takanori, and Kawarabayashi, Kenichi. 2015. Unsupervised cross-domain word representation learning. Pages 730–740 of: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics.
Bonilla, Edwin V., Chai, Kian Ming Adam, and Williams, Christopher K. I. 2007. Multi-task Gaussian process prediction. Pages 153–160 of: Advances in Neural Information Processing Systems 20.
Bou-Ammar, Haitham, Tuyls, Karl, Taylor, Matthew E., Driessens, Kurt, and Weiss, Gerhard. 2012. Reinforcement learning transfer via sparse coding. Pages 383–390 of: Proceedings of International Conference on Autonomous Agents and Multiagent Systems.
Bou-Ammar, Haitham, Eaton, Eric, Ruvolo, Paul, and Taylor, Matthew E. 2014. Online multi-task learning for policy gradient methods. Pages 1206–1214 of: Proceedings of the 31th International Conference on Machine Learning.
Bou-Ammar, Haitham, Eaton, Eric, Ruvolo, Paul, and Taylor, Matthew E. 2015. Unsupervised cross-domain transfer in policy gradient reinforcement learning via manifold alignment. Pages 2504–2510 of: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
Bousmalis, Konstantinos, Trigeorgis, George, Silberman, Nathan, Krishnan, Dilip, and Erhan, Dumitru. 2016. Domain separation networks. Pages 343–351 of: Advances in Neural Information Processing Systems.
Bousquet, Olivier, and Elisseeff, André. 2002. Stability and generalization. Journal of Machine Learning Research, 2, 499–526.
Braud, Chloé, Lacroix, Ophélie, and Søgaard, Anders. 2017. Cross-lingual and cross-domain discourse segmentation of entire documents. Pages 237–243 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Bromley, Jane, Guyon, Isabelle, LeCun, Yann, Säckinger, Eduard, and Shah, Roopak. 1993. Signature verification using a Siamese time delay neural network. Pages 737–744 of: Advances in Neural Information Processing Systems.
Brosch, Tom, and Tam, Roger C. 2013. Manifold learning of brain MRIs by deep learning. Pages 633–640 of: Proceedings of the 16th International Conference on Medical Image Computing and Computer-Assisted Intervention.
Brunskill, Emma, and Li, Lihong. 2013. Sample complexity of multi-task reinforcement learning. In: Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence.
Bruzzone, Lorenzo, and Marconcini, Mattia. 2010. Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 770–787.
Bryant, Peter E., and Trabasso, Thomas. 1971. Transitive inferences and memory in young children. Nature, 232, 456–458.
Bulling, Andreas, and Roggen, Daniel. 2011. Recognition of visual memory recall processes using eye movement analysis. Pages 455–464 of: Proceedings of the 13th International Conference on Ubiquitous Computing.
Bulling, Andreas, Ward, Jamie A., Gellersen, Hans, and Tröster, Gerhard. 2008. Robust recognition of reading activity in transit using wearable electrooculography. Pages 19–37 of: Proceedings of the 6th International Conference on Pervasive Computing.
Bulling, Andreas, Blanke, Ulf, and Schiele, Bernt. 2014. A tutorial on human activity recognition using body-worn inertial sensors. ACM Computing Surveys, 46(3), 33:1–33:33.
Calandriello, Daniele, Lazaric, Alessandro, and Restelli, Marcello. 2014. Sparse multi-task reinforcement learning. Pages 819–827 of: Advances in Neural Information Processing Systems.
Cao, Qiong, Ying, Yiming, and Li, Peng. 2013. Similarity metric learning for face recognition. Pages 2408–2415 of: Proceedings of IEEE International Conference on Computer Vision.
Cao, Zhangjie, Long, Mingsheng, Wang, Jianmin, and Jordan, Michael I. 2017. Partial transfer learning with selective adversarial networks. CoRR, abs/1707.07901.
Carbonell, Jaime G. 1981. A computational model of analogical problem solving. Pages 147–152 of: Proceedings of the 7th International Joint Conference on Artificial Intelligence.
Carbonell, Jaime G., Etzioni, Oren, Gil, Yolanda, et al. 1991. PRODIGY: An integrated architecture for planning and learning. SIGART Bulletin, 2(4), 51–55.
Carlson, Andrew, Betteridge, Justin, Kisiel, Bryan, et al. 2010. Toward an architecture for never-ending language learning. In: Proceedings of the 24th AAAI Conference on Artificial Intelligence.
Caruana, Rich. 1997. Multitask learning. Machine Learning, 28(1), 41–75.
Casanueva, Inigo, Hain, Thomas, Christensen, Heidi, Marxer, Ricard, and Green, Phil. 2015. Knowledge transfer between speakers for personalised dialogue management. Pages 12–21 of: Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue.
Castrejon, Lluis, Aytar, Yusuf, Vondrick, Carl, Pirsiavash, Hamed, and Torralba, Antonio. 2016. Learning aligned cross-modal representations from weakly aligned data. Pages 2940–2949 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Cavallanti, Giovanni, Cesa-Bianchi, Nicolò, and Gentile, Claudio. 2010. Linear algorithms for online multitask classification. Journal of Machine Learning Research, 11, 2901–2934.
Chaudhuri, Kamalika, Monteleoni, Claire, and Sarwate, Anand D. 2011. Differentially private empirical risk minimization. Journal of Machine Learning Research, 12, 1069–1109.
Chavarriaga, Ricardo, Sagha, Hesam, Calatroni, Alberto, et al. 2013. The opportunity challenge: A benchmark database for on-body sensor-based activity recognition. Pattern Recognition Letters, 34(15), 2033–2042.
Chen, Austin H., and Huang, Zone-Wei. 2010. A new multi-task learning technique to predict classification of leukemia and prostate cancer. Pages 11–20 of: Proceedings of the Second International Conference on Medical Biometrics.
Chen, Jianhui, Tang, Lei, Liu, Jun, and Ye, Jieping. 2009. A convex formulation for learning shared structures from multiple tasks. Pages 137–144 of: Proceedings of the 26th International Conference on Machine Learning.
Chen, Jianhui, Liu, Ji, and Ye, Jieping. 2010a. Learning incoherent sparse and low-rank patterns from multiple tasks. Pages 1179–1188 of: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Chen, Jianhui, Zhou, Jiayu, and Ye, Jieping. 2011. Integrating low-rank and group-sparse structures for robust multi-task learning. Pages 42–50 of: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Chen, Minmin, Xu, Zhixiang, Sha, Fei, and Weinberger, Kilian Q. 2012a. Marginalized denoising autoencoders for domain adaptation. Pages 767–774 of: Proceedings of the 29th International Conference on Machine Learning.
Chen, Minmin, Xu, Z., Weinberger, Kilian Q., and Sha, Fei. 2012b. Marginalized stacked denoising autoencoders. In: Proceedings of the Learning Workshop.
Chen, Wei-Yu, Hsu, Tzu-Ming Harry, Tsai, Yao-Hung Hubert, Wang, Yu-Chiang Frank, and Chen, Ming-Syan. 2016a. Transfer neural trees for heterogeneous domain adaptation. Pages 399–414 of: Proceedings of European Conference on Computer Vision.
Chen, Xi, Duan, Yan, Houthooft, Rein, Schulman, John, Sutskever, Ilya, and Abbeel, Pieter. 2016b. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. Pages 2172–2180 of: Advances in Neural Information Processing Systems.
Chen, Yuqiang, Jin, Ou, Xue, Gui-Rong, Chen, Jia, and Yang, Qiang. 2010b. Visual contextual advertising: Bringing textual advertisements to images. In: Proceedings of 24th AAAI Conference on Artificial Intelligence.
Chen, Zhiyuan, and Liu, Bing. 2016. Lifelong Machine Learning. Morgan & Claypool.
Chen, Zhiyuan, Ma, Nianzu, and Liu, Bing. 2015. Lifelong learning for sentiment classification. Pages 750–756 of: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics.
Cheng, Heng-Tze, Koc, Levent, Harmsen, Jeremiah, et al. 2016. Wide & deep learning for recommender systems. Pages 7–10 of: Proceedings of the 1st Workshop on Deep Learning for Recommender Systems.
Choi, Eunsol, Hewlett, Daniel, Uszkoreit, Jakob, et al. 2017. Coarse-to-fine question answering for long documents. Pages 209–220 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Chomsky, Noam. 1956. Three models for the description of language. IRE Transactions on Information Theory, 2(3), 113–124.
Cilibrasi, Rudi, and Vitányi, Paul M. B. 2007. The Google similarity distance. IEEE Transactions on Knowledge and Data Engineering, 19(3), 370–383.
Collobert, Ronan, and Weston, Jason. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. Pages 160–167 of: Proceedings of the 25th International Conference on Machine Learning.
Conneau, Alexis, Kiela, Douwe, Schwenk, Holger, Barrault, Loïc, and Bordes, Antoine. 2017. Supervised learning of universal sentence representations from natural language inference data. Pages 670–680 of: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
Cortes, Corinna, Mansour, Yishay, and Mohri, Mehryar. 2010. Learning bounds for importance weighting. Pages 442–450 of: Advances in Neural Information Processing Systems.
Cortes, Corinna, Mohri, Mehryar, and Medina, Andres Muñoz. 2015. Adaptation algorithm and theory based on generalized discrepancy. Pages 169–178 of: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Covington, Paul, Adams, Jay, and Sargin, Emre. 2016. Deep neural networks for YouTube recommendations. Pages 191–198 of: Proceedings of the 10th ACM Conference on Recommender Systems.
Crammer, Koby, and Mansour, Yishay. 2012. Learning multiple tasks using shared hypotheses. Pages 1484–1492 of: Advances in Neural Information Processing Systems.
Cree, V., and Macaulay. 2000. Transfer of Learning in Professional and Vocational Education. Routledge.
Csurka, Gabriela. 2017. Domain adaptation for visual applications: A comprehensive survey. CoRR, abs/1702.05374.
da Silva, Bruno Castro, Konidaris, George, and Barto, Andrew G. 2012. Learning parameterized skills. In: Proceedings of the 29th International Conference on Machine Learning.
Dahlmeier, Daniel, and Ng, Hwee Tou. 2010. Domain adaptation for semantic role labeling in the biomedical domain. Bioinformatics, 26(8), 1098–1104.
Dai, Wenyuan, Xue, Gui-Rong, Yang, Qiang, and Yu, Yong. 2007a. Transferring naive Bayes classifiers for text classification. Pages 540–545 of: Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence.
Dai, Wenyuan, Yang, Qiang, Xue, Gui-Rong, and Yu, Yong. 2007b. Boosting for transfer learning. Pages 193–200 of: Proceedings of the 24th International Conference on Machine Learning.
Dai, Wenyuan, Chen, Yuqiang, Xue, Gui-Rong, Yang, Qiang, and Yu, Yong. 2008. Translated learning: Transfer learning across different feature spaces. Pages 353–360 of: Advances in Neural Information Processing Systems.
Das, Abhinandan S., Datar, Mayur, Garg, Ashutosh, and Rajaram, Shyam. 2007. Google news personalization: Scalable online collaborative filtering. Pages 271–280 of: Proceedings of the 16th International Conference on World Wide Web.
Daumé III, Hal. 2007. Frustratingly easy domain adaptation. Pages 256–263 of: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics.
Davis, Jesse, and Domingos, Pedro. 2009. Deep transfer via second-order Markov logic. Pages 217–224 of: Proceedings of the 26th International Conference on Machine Learning.
Dekel, Ofer, Long, Philip M., and Singer, Yoram. 2006. Online multitask learning. Pages 453–467 of: Proceedings of the 19th Annual Conference on Learning Theory.
Dekel, Ofer, Long, Philip M., and Singer, Yoram. 2007. Online learning of multiple tasks with a shared loss. Journal of Machine Learning Research, 8, 2233–2264.
Dempster, A. P., Laird, N. M., and Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1), 1–38.
Deng, Wan-Yu, Zheng, Qing-Hua, and Wang, Zhong-Min. 2014. Cross-person activity recognition using reduced kernel extreme learning machine. Neural Networks, 53, 1–7.
Denton, Emily L., Chintala, Soumith, Fergus, Rob, et al. 2015. Deep generative image models using a Laplacian pyramid of adversarial networks. Pages 1486–1494 of: Advances in Neural Information Processing Systems.
Devin, Coline, Gupta, Abhishek, Darrell, Trevor, Abbeel, Pieter, and Levine, Sergey. 2017. Learning modular neural network policies for multi-task and multi-robot transfer. Pages 2169–2176 of: Proceedings of IEEE International Conference on Robotics and Automation.
Devlin, Jacob, Chang, Ming-Wei, Lee, Kenton, and Toutanova, Kristina. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805.
Dietterich, Thomas G., Lathrop, Richard H., and Lozano-Pérez, Tomás. 1997. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence, 89(1–2), 31–71.
Donahue, Jeff, Hoffman, Judy, Rodner, Erik, Saenko, Kate, and Darrell, Trevor. 2013. Semi-supervised domain adaptation with instance constraints. Pages 668–675 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Donahue, Jeff, Jia, Yangqing, Vinyals, Oriol, et al. 2014. DeCAF: A deep convolutional activation feature for generic visual recognition. Pages 647–655 of: Proceedings of the 31th International Conference on Machine Learning.
Donahue, Jeff, Krähenbühl, Philipp, and Darrell, Trevor. 2016. Adversarial feature learning. CoRR, abs/1605.09782.
Dong, Daxiang, Wu, Hua, He, Wei, Yu, Dianhai, and Wang, Haifeng. 2015. Multi-task learning for multiple language translation. Pages 1723–1732 of: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing.
Dönnes, Pierre, and Elofsson, Arne. 2002. Prediction of MHC Class I binding peptides, using SVMHC. BMC Bioinformatics, 3, 25.
Dou, Qi, Ouyang, Cheng, Chen, Cheng, Chen, Hao, and Heng, Pheng-Ann. 2018. Unsupervised cross-modality domain adaptation of ConvNets for biomedical image segmentations with adversarial loss. Pages 691–697 of: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence.
Drummond, Chris. 2002. Accelerating reinforcement learning by composing solutions of automatically identified subtasks. Journal of Artificial Intelligence Research, 16, 59–104.
Duan, Lixin, Tsang, Ivor W., Xu, Dong, and Maybank, Stephen J. 2009. Domain transfer SVM for video concept detection. Pages 1375–1381 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Duan, Lixin, Tsang, Ivor W., and Xu, Dong. 2012a. Domain transfer multiple kernel learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(3), 465–479.
Duan, Lixin, Xu, Dong, and Tsang, Ivor W. 2012b. Learning with augmented features for heterogeneous domain adaptation. Pages 711–718 of: Proceedings of International Conference on Machine Learning.
Duan, Lixin, Xu, Dong, Tsang, Ivor Wai-Hung, and Luo, Jiebo. 2012c. Visual event recognition in videos by learning from web data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(9), 1667–1680.
Dumoulin, Vincent, Belghazi, Ishmael, Poole, Ben, et al. 2016. Adversarially learned inference. CoRR, abs/1606.00704.
Duong, Long, Cohn, Trevor, Bird, Steven, and Cook, Paul. 2015. Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser. Pages 845–850 of: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing.
Dwork, Cynthia. 2008. Differential privacy: A survey of results. Pages 1–19 of: Proceedings of the 5th Annual Conference on Theory and Applications of Models of Computation.
Dwork, Cynthia, and Roth, Aaron. 2014. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4), 211–407.
Dwork, Cynthia, Kenthapadi, Krishnaram, McSherry, Frank, Mironov, Ilya, and Naor, Moni. 2006a. Our data, ourselves: Privacy via distributed noise generation. Pages 486–503 of: Proceedings of the 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques.
Dwork, Cynthia, McSherry, Frank, Nissim, Kobbi, and Smith, Adam D. 2006b. Calibrating noise to sensitivity in private data analysis. Pages 265–284 of: Proceedings of the 3rd Theory of Cryptography Conference.
Ellis, Henry Carlton. 1965. The Transfer of Learning. MacMillan.
Elman, Jeffrey L. 1993. Learning and development in neural networks: The importance of starting small. Cognition, 48(1), 71–99.
Emekçi, Fatih, Sahin, Ozgur D., Agrawal, Divyakant, and El Abbadi, Amr. 2007. Privacy preserving decision tree learning over multiple parties. Data and Knowledge Engineering, 63(2), 348–361.
Esteva, Andre, Kuprel, Brett, Novoa, Roberto A., et al. 2017. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115–118.
Evgeniou, A., and Pontil, Massimiliano. 2007. Multi-task feature learning. Advances in Neural Information Processing Systems, 19, 41.
Evgeniou, Theodoros, and Pontil, Massimiliano. 2004. Regularized multi-task learning. Pages 109–117 of: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Evgeniou, Theodoros, Micchelli, Charles A., and Pontil, Massimiliano. 2005. Learning multiple tasks with kernel methods. Journal of Machine Learning Research, 6, 615–637.
Fan, Jianqing, and Li, Runze. 2006. Statistical challenges with high dimensionality: Feature selection in knowledge discovery. arXiv, arXiv:math/0602133.
Fan, Xing, Monti, Emilio, Mathias, Lambert, and Dreyer, Markus. 2017. Transfer learning for neural semantic parsing. Pages 48–56 of: Proceedings of the 2nd Workshop on Representation Learning for NLP.
Fang, Meng, and Cohn, Trevor. 2017. Model transfer for tagging low-resource languages using a bilingual dictionary. Pages 587–593 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Fang, Meng, and Tao, Dacheng. 2015. Active multi-task learning via bandits. Pages 505–513 of: Proceedings of the 2015 SIAM International Conference on Data Mining.
Fang, Meng, Yin, Jie, and Zhu, Xingquan. 2013. Transfer learning across networks for collective classification. Pages 161–170 of: Proceedings of IEEE International Conference on Data Mining.
Fang, Meng, Yin, Jie, Zhu, Xingquan, and Zhang, Chengqi. 2015. TrGraph: Cross-network transfer learning via common signature subgraphs. IEEE Transactions on Knowledge and Data Engineering, 27(9), 2536–2549.
Ferguson, Kimberly, and Mahadevan, Sridhar. 2006. Proto-transfer learning in Markov decision processes using spectral methods. Proceedings of ICML Workshop on Transfer Learning.
Ferns, Norm, Panangaden, Prakash, and Precup, Doina. 2004. Metrics for finite Markov decision processes. Pages 162–169 of: Proceedings of the 20th Conference in Uncertainty in Artificial Intelligence.
Feurer, Matthias, Klein, Aaron, Eggensperger, Katharina, et al. 2015. Efficient and robust automated machine learning. Pages 2962–2970 of: Advances in Neural Information Processing Systems 28.
Firat, Orhan, Sankaran, Baskaran, Al-Onaizan, Yaser, et al. 2016. Zero-resource translation with multi-lingual neural machine translation. Pages 268–277 of: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
Fong, Pui Kuen, and Weber-Jahnke, Jens H. 2012. Privacy preserving decision tree learning using unrealized data sets. IEEE Transactions on Knowledge and Data Engineering, 24(2), 353–364.
Forbus, Kenneth D., Gentner, Dedre, Markman, Arthur B., and Ferguson, Ronald W. 1998. Analogy just looks like high level perception: Why a domain-general approach to analogical mapping is right. Journal of Experimental and Theoretical Artificial Intelligence, 10(2), 231–257.
Friedman, Jerome, Hastie, Trevor, and Tibshirani, Robert. 2001. The Elements of Statistical Learning. Springer.
Frome, Andrea, Corrado, Gregory S., Shlens, Jonathon, et al. 2013. DeViSE: A deep visual-semantic embedding model. Pages 2121–2129 of: Advances in Neural Information Processing Systems.
Ganguly, Soumyajit, and Pudi, Vikram. 2017. Paper2vec: Combining graph and text information for scientific paper representation. Pages 383–395 of: Proceedings of European Conference on Information Retrieval.
Ganin, Yaroslav, and Lempitsky, Victor. 2015. Unsupervised domain adaptation by back-propagation. Pages 1180–1189 of: Proceedings of the 32nd International Conference on Machine Learning.
Ganin, Yaroslav, Ustinova, Evgeniya, Ajakan, Hana, et al. 2016. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17, 2096–2030.
Gao, Chen, Chen, Xiangning, Feng, Fuli, et al. 2019. Cross-domain recommendation without sharing user-relevant data. Pages 491–502 of: Proceedings of the 2019 World Wide Web Conference on World Wide Web.
Gao, Sheng, Luo, Hao, Chen, Da, et al. 2013. Cross-domain recommendation via cluster-level latent factor model. Pages 161–176 of: Proceedings of the European Conference on Machine Learning and Practice of Knowledge Discovery in Databases.
Gašić, M., Kim, Dongho, Tsiakoulis, Pirros, and Young, Steve. 2015a. Distributed dialogue policies for multi-domain statistical dialogue management. Pages 5371–5375 of: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing.
Gašić, M., Mrkšic, N., Barahona, L. Rojas, et al. 2015b. Multi-agent learning in multi-domain spoken dialogue systems. In: Proceedings of NIPS workshop on Spoken Language Understanding and Interaction.
Gašić, Milica, Breslin, Catherine, Henderson, Matthew, et al. 2013. POMDP-based dialogue manager adaptation to extended domains. In: Proceedings of the 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue.
Gašić, Milica, Kim, Dongho, Tsiakoulis, Pirros, et al. 2014. Incremental on-line adaptation of POMDP-based dialogue managers to extended domains. Pages 140–144 of: Proceedings of the 15th Annual Conference of the International Speech Communication Association.
Gašić, Milica, Mrkšic, Nikola, Su, Pei-hao, et al. 2015c. Policy committee for adaptation in multi-domain spoken dialogue systems. Pages 806–812 of: Proceedings of 2015 IEEE Workshop on Automatic Speech Recognition and Understanding.
Gatys, Leon A., Ecker, Alexander S., and Bethge, Matthias. 2016. Image style transfer using convolutional neural networks. Pages 2414–2423 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Genevay, Aude, and Laroche, Romain. 2016. Transfer learning for user adaptation in spoken dialogue systems. Pages 975–983 of: Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems.
Germain, Pascal, Habrard, Amaury, Laviolette, François, and Morvant, Emilie. 2013. A PAC-Bayesian approach for domain adaptation with specialization to linear classifiers. Pages 738–746 of: Proceedings of the 30th International Conference on Machine Learning.
Getoor, Lise, and Taskar, Ben. 2007. Introduction to Statistical Relational Learning. MIT Press.
Ghifary, Muhammad, Bastiaan Kleijn, W., Zhang, Mengjie, and Balduzzi, David. 2015. Domain generalization for object recognition with multi-task autoencoders. Pages 2551–2559 of: Proceedings of the IEEE International Conference on Computer Vision.
Ghifary, Muhammad, Kleijn, W. Bastiaan, Zhang, Mengjie, Balduzzi, David, and Li, Wen. 2016. Deep reconstruction-classification networks for unsupervised domain adaptation. Pages 597–613 of: Proceedings of European Conference on Computer Vision.
Gillick, Dan, Brunk, Cliff, Vinyals, Oriol, and Subramanya, Amarnag. 2016. Multilingual language processing from bytes. Pages 1296–1306 of: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Giorgi, John M., and Bader, Gary. 2018. Transfer learning for biomedical named entity recognition with neural networks. Bioinformatics, 34(23), 4087–4094.
Girshick, Ross, Donahue, Jeff, Darrell, Trevor, and Malik, Jitendra. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. Pages 580–587 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Glorot, Xavier, and Bengio, Yoshua. 2010. Understanding the difficulty of training deep feedforward neural networks. Pages 249–256 of: Proceedings of International Conference on Artificial Intelligence and Statistics.
Glorot, Xavier, Bordes, Antoine, and Bengio, Yoshua. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. Pages 513–520 of: Proceedings of the 28th International Conference on Machine Learning.
Gong, Boqing, Shi, Yuan, Sha, Fei, and Grauman, Kristen. 2012a. Geodesic flow kernel for unsupervised domain adaptation. Pages 2066–2073 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Gong, Pinghua, Ye, Jieping, and Zhang, Changshui. 2012b. Robust multi-task feature learning. Pages 895–903 of: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Gong, Pinghua, Ye, Jieping, and Zhang, Changshui. 2013. Multi-stage multi-task feature learning. Journal of Machine Learning Research, 14, 2979–3010.
Goodfellow, Ian, Pouget-Abadie, Jean, Mirza, Mehdi, et al. 2014. Generative adversarial nets. Pages 2672–2680 of: Advances in Neural Information Processing Systems.
Gopalan, Raghuraman, Li, Ruonan, and Chellappa, Rama. 2011. Domain adaptation for object recognition: An unsupervised approach. Pages 999–1006 of: Proceedings of IEEE International Conference on Computer Vision.
Gopalan, Raghuraman, Li, Ruonan, and Chellappa, Rama. 2014. Unsupervised adaptation across domain shifts by generating intermediate data representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11), 2288–2302.
Görnitz, Nico, Widmer, Christian, Zeller, Georg, et al. 2011. Hierarchical multitask structured output learning for large-scale sequence segmentation. Pages 2690–2698 of: Advances in Neural Information Processing Systems.
Gouws, Stephan, Bengio, Yoshua, and Corrado, Greg. 2015. BilBOWA: Fast bilingual distributed representations without word alignments. Pages 748–756 of: Proceedings of the 32nd International Conference on Machine Learning.
Gretton, Arthur, Bousquet, Olivier, Smola, Alex, and Schölkopf, Bernhard. 2005. Measuring statistical dependence with Hilbert-Schmidt norms. Pages 63–77 of: Proceedings of International Conference on Algorithmic Learning Theory.
Gretton, Arthur, Borgwardt, Karsten M., Rasch, Malte, Schölkopf, Bernhard, and Smola, Alex J. 2007. A kernel method for the two-sample-problem. Pages 513–520 of: Advances in Neural Information Processing Systems.
Gretton, Arthur, Sejdinovic, Dino, Strathmann, Heiko, et al. 2012. Optimal kernel choice for large-scale two-sample tests. Pages 1214–1222 of: Advances in Neural Information Processing Systems.
Guo, Bin, Li, Jing, Zheng, Vincent W., Wang, Zhu, and Yu, Zhiwen. 2018a. CityTransfer: Transferring inter- and intra-city knowledge for chain store site recommendation based on multi-source urban data. Pages 135:1–135:23 of: Proceeding of the 2018 ACM International Joint Conference on Pervasive and Ubiquitous Computing.
Guo, Jiang, Che, Wanxiang, Wang, Haifeng, and Liu, Ting. 2016a. Exploiting multi-typed treebanks for parsing with deep multi-task learning. CoRR, abs/1606.01161.
Guo, Jiang, Che, Wanxiang, Wang, Haifeng, and Liu, Ting. 2016b. A universal framework for inductive transfer parsing across multi-typed treebanks. Pages 12–22 of: Proceedings of the 26th International Conference on Computational Linguistics.
Guo, Xiawei, Yao, Quanming, Tu, Wei-Wei, et al. 2018b. Privacy-preserving transfer learning for knowledge sharing. CoRR, abs/1811.09491.
Guo, Zhenyu, and Wang, Z. Jane. 2013. Cross-domain object recognition via input-output Kernel analysis. IEEE Transactions on Image Processing, 22(8), 3108–3119.
Gupta, Sunil Kumar, Phung, Dinh, Adams, Brett, Tran, Truyen, and Venkatesh, Svetha. 2010. Nonnegative shared subspace learning and its application to social media retrieval. Pages 1169–1178 of: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Haeberlen, Andreas, Flannery, Eliot, Ladd, Andrew M., et al. 2004. Practical robust localization over large-scale 802.11 wireless networks. Pages 70–84 of: Proceedings of the 10th Annual International Conference on Mobile Computing and Networking.
Ham, Ji Hun, Lee, Daniel D., and Saul, Lawrence K. 2003. Learning high dimensional correspondences from low dimensional manifolds. Proceedings of ICML Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining.
Hamm, Jihun, Cao, Yingjun, and Belkin, Mikhail. 2016. Learning privately from multiparty data. Pages 555–563 of: Proceedings of the 33rd International Conference on Machine Learning.
Hammerla, Nils Y., Halloran, Shane, and Plötz, Thomas. 2016. Deep, convolutional, and recurrent models for human activity recognition using wearables. Pages 1533–1540 of: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence.
Han, Jiawei, and Kamber, Micheline. 2000. Data Mining: Concepts and Techniques. Morgan Kaufmann.
Han, Lei, and Zhang, Yu. 2015a. Learning multi-level task groups in multi-task learning. Pages 2638–2644 of: Proceedings of the 29th AAAI Conference on Artificial Intelligence.
Han, Lei, and Zhang, Yu. 2015b. Learning tree structure in multi-task learning. Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
Han, Lei, and Zhang, Yu. 2016. Multi-stage multi-task learning with reduced rank. Pages 1638–1644 of: Proceedings of the 30th AAAI Conference on Artificial Intelligence.
Han, Lei, Zhang, Yu, Song, Guojie, and Xie, Kunqing. 2014. Encoding tree sparsity in multi-task learning: A probabilistic framework. Pages 1854–1860 of: Proceedings of the 28th AAAI Conference on Artificial Intelligence.
Harris, Zellig S. 1954. Distributional structure. Word, 10(2–3), 146–162.
Hashimoto, Kazuma, Tsuruoka, Yoshimasa, Socher, Richard, et al. 2017. A joint many-task model: Growing a neural network for multiple NLP tasks. Pages 1923–1933 of: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
Hausknecht, Matthew J., and Stone, Peter. 2015. Deep recurrent Q-learning for partially observable MDPs. CoRR, abs/1507.06527.
He, Jia, Liu, Rui, Zhuang, Fuzhen, et al. 2018a. A general cross-domain recommendation framework via Bayesian neural network. Pages 1001–1006 of: Proceedings of the 2018 IEEE International Conference on Data Mining.
He, Jingrui, and Lawrence, Rick. 2011. A graph-based framework for multi-task multi-view learning. Pages 25–32 of: Proceedings of the 28th International Conference on Machine Learning.
He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. 2016. Identity mappings in deep residual networks. Pages 630–645 of: European Conference on Computer Vision.
He, Kaiming, Girshick, Ross B., and Dollár, Piotr. 2018b. Rethinking ImageNet pre-training. CoRR, abs/1811.08883.
He, Yulan, and Young, Steve. 2006. Spoken language understanding using the hidden vector state model. Speech Communication, 48(3), 262–275.
Henderson, Matthew, Gašić, Milica, Thomson, Blaise, et al. 2012. Discriminative spoken language understanding using word confusion networks. Pages 176–181 of: Proceedings of IEEE Spoken Language Technology Workshop.
Henderson, Matthew, Thomson, Blaise, and Young, Steve. 2014. Word-based dialog state tracking with recurrent neural networks. Pages 292–299 of: Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue.
Hengst, Bernhard. 2002. Discovering hierarchy in reinforcement learning with HEXQ. Pages 243–250 of: Proceedings of the Nineteenth International Conference on Machine Learning.
Hernández-Lobato, Daniel, and Hernández-Lobato, José Miguel. 2013. Learning feature selection dependencies in multi-task learning. Pages 746–754 of: Advances in Neural Information Processing Systems.
Hernández-Lobato, Daniel, Hernández-Lobato, José Miguel, and Ghahramani, Zoubin. 2015. A probabilistic model for dirty multi-task feature selection. Pages 1073–1082 of: Proceedings of the 32nd International Conference on Machine Learning.
Hinrichs, Thomas R., and Forbus, Kenneth D. 2011. Transfer learning through analogy in games. AI Magazine, 32(1), 70–83.
Hoffman, Judy, Rodner, Erik, Donahue, Jeff, Saenko, Kate, and Darrell, Trevor. 2013. Efficient learning of domain-invariant image representations. CoRR, abs/1301.3224.
Hoffman, Judy, Guadarrama, Sergio, Tzeng, Eric S., et al. 2014. LSDA: Large scale detection through adaptation. Pages 3536–3544 of: Advances in Neural Information Processing Systems.
Hofmann, Thomas. 1999. Probabilistic latent semantic analysis. Pages 289–296 of: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence.
Holyoak, Keith J., and Thagard, Paul. 1989. Analogical mapping by constraint satisfaction. Cognitive Science, 13(3), 295–355.
Hu, Guangneng, Zhang, Yu, and Yang, Qiang. 2018. CoNet: Collaborative cross networks for cross-domain recommendation. Pages 667–676 of: Proceedings of the 27th ACM International Conference on Information and Knowledge Management.
Hu, Guangneng, Zhang, Yu, and Yang, Qiang. 2019. Transfer meets hybrid: A synthetic approach for cross-domain collaborative filtering with text. Pages 2822–2829 of: Proceedings of the Web Conference.
Hu, Minqing, and Liu, Bing. 2004. Mining and summarizing customer reviews. Pages 168–177 of: Proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Huang, Jiayuan, Smola, Alexander J., Gretton, Arthur, Borgwardt, Karsten M., and Schölkopf, Bernhard. 2006. Correcting sample selection bias by unlabeled data. Pages 601–608 of: Advances in Neural Information Processing Systems.
Huang, Jui-Ting, Li, Jinyu, Dong, Yu, Deng, Li, and Gong, Yifan. 2013. Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers. Pages 7304–7308 of: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.
Huber, Peter J. 1964. Robust estimation of a location parameter. The Annals of Mathematical Statistics, 35(1), 73–101.
Isola, Phillip, Zhu, Jun-Yan, Zhou, Tinghui, and Efros, Alexei A. 2017. Image-to-image translation with conditional adversarial networks. Pages 1125–1134 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Jacob, Laurent, and Vert, Jean-Philippe. 2007. Efficient peptide-MHC-I binding prediction for alleles with few known binders. Bioinformatics, 24(3), 358–366.
Jacob, Laurent, Bach, Francis R., and Vert, Jean-Philippe. 2008. Clustered multi-task learning: A convex formulation. Pages 745–752 of: Advances in Neural Information Processing Systems.
Jagannathan, Geetha, Pillaipakkamnatt, Krishnan, and Wright, Rebecca N. 2012. A practical differentially private random decision tree classifier. Transactions on Data Privacy, 5(1), 273–295.
Jalali, Ali, Ravikumar, Pradeep, Sanghavi, Sujay, and Ruan, Chao. 2010. A dirty model for multi-task learning. Pages 964–972 of: Advances in Neural Information Processing Systems 23.
Jean, Neal, Burke, Marshall, Xie, Michael, et al. 2016. Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790–794.
Jeffreys, Harold. 1946. An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences, 86(1007).
Jeong, Minwoo, and Lee, Gary Geunbae. 2009. Multi-domain spoken language understanding with transfer learning. Speech Communication, 51(5), 412–424.
Jernite, Yacine, Bowman, Samuel R., and Sontag, David. 2017. Discourse-based objectives for fast unsupervised sentence representation learning. CoRR, abs/1705.00557.
Ji, Zhanglong, Jiang, Xiaoqian, Wang, Shuang, Xiong, Li, and Ohno-Machado, Lucila. 2014. Differentially private distributed logistic regression using private and public data. BMC Medical Genomics, 7(1), S14.
Jia, Yangqing, Salzmann, Mathieu, and Darrell, Trevor. 2010. Factorized latent spaces with structured sparsity. Pages 982–990 of: Advances in Neural Information Processing Systems.
Jiang, Jing. 2009. Multi-task transfer learning for weakly-supervised relation extraction. Pages 1012–1020 of: Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP.
Jiang, Jing, and Zhai, Chengxiang. 2007. Instance weighting for domain adaptation in NLP. Pages 264–271 of: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics.
Jiang, Wei, Zavesky, Eric, Chang, Shih-Fu, and Loui, Alex. 2008. Cross-domain learning methods for high-level visual concept classification. Pages 161–164 of: Proceedings of the 15th IEEE International Conference on Image Processing.
Jie, Luo, Tommasi, Tatiana, and Caputo, Barbara. 2011. Multiclass transfer learning from unconstrained priors. Pages 1863–1870 of: Proceedings of IEEE International Conference on Computer Vision.
Joachims, Thorsten. 1999. Transductive inference for text classification using support vector machines. Pages 200–209 of: Proceedings of the Sixteenth International Conference on Machine Learning.
Johnson, Justin, Alahi, Alexandre, and Fei-Fei, Li. 2016a. Perceptual losses for real-time style transfer and super-resolution. Pages 694–711 of: Proceedings of European Conference on Computer Vision.
Johnson, Melvin, Schuster, Mike, Le, Quoc V., et al. 2016b. Google’s multilingual neural machine translation system: Enabling zero-shot translation. CoRR, abs/1611.04558.
Joshi, Chaitanya K., Mi, Fei, and Faltings, Boi. 2017. Personalization in goal-oriented dialog. CoRR, abs/1706.07503.
Juba, Brendan. 2006. Estimating relatedness via data compression. Pages 441–448 of: Proceedings of the 23rd International Conference on Machine Learning.
Kakade, Sham M., Shalev-Shwartz, Shai, and Tewari, Ambuj. 2012. Regularization techniques for learning with matrices. Journal of Machine Learning Research, 13, 1865–1890.
Kalousis, Alexandros, Prados, Julien, and Hilario, Melanie. 2007. Stability of feature selection algorithms: A study on high-dimensional spaces. Knowledge and Information Systems, 12(1), 95–116.
Kanagawa, Heishiro, Kobayashi, Hayato, Shimizu, Nobuyuki, Tagami, Yukihiro, and Suzuki, Taiji. 2019. Cross-domain recommendation via deep domain adaptation. Pages 20–29 of: Proceedings of the 41st European Conference on Information Retrieval.
Kanamori, Takafumi, Hido, Shohei, and Sugiyama, Masashi. 2009. A least-squares approach to direct importance estimation. Journal of Machine Learning Research, 10, 1391–1445.
Kang, Zhuoliang, Grauman, Kristen, and Sha, Fei. 2011. Learning with whom to share in multi-task feature learning. Pages 521–528 of: Proceedings of the 28th International Conference on Machine Learning.
Karpathy, Andrej, Toderici, George, Shetty, Sanketh, et al. 2014. Large-scale video classification with convolutional neural networks. Pages 1725–1732 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Kashima, Hisashi, Yamanishi, Yoshihiro, Kato, Tsuyoshi, Sugiyama, Masashi, and Tsuda, Koji. 2009. Simultaneous inference of biological networks of multiple species from genome-wide data and evolutionary information: A semi-supervised approach. Bioinformatics, 25(22), 2962–2968.
Katiyar, Arzoo, and Cardie, Claire. 2017. Going out on a limb: Joint extraction of entity mentions and relations without dependency trees. Pages 917–928 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Kato, Tsuyoshi, Kashima, Hisashi, Sugiyama, Masashi, and Asai, Kiyoshi. 2007. Multi-task learning via conic programming. Pages 737–744 of: Advances in Neural Information Processing Systems.
Kato, Tsuyoshi, Kashima, Hisashi, Sugiyama, Masashi, and Asai, Kiyoshi. 2010a. Conic Programming for multitask learning. IEEE Transactions on Knowledge and Data Engineering, 22(7), 957–968.
Kato, Tsuyoshi, Okada, Kinya, Kashima, Hisashi, and Sugiyama, Masashi. 2010b. A transfer learning approach and selective integration of multiple types of assays for biological network inference. International Journal of Knowledge Discovery in Bioinformatics, 1(1), 66–80.
Keogh, Eamonn J., and Pazzani, Michael J. 2000. Scaling up dynamic time warping for datamining applications. Pages 285–289 of: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Kermany, Daniel S., Goldbaum, Michael, Cai, Wenjia, et al. 2018. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell, 172(5), 1122–1131.
Khan, Md Abdullah Al Hafiz, Roy, Nirmalya, and Misra, Archan. 2018. Scaling human activity recognition via deep learning-based domain adaptation. Pages 1–9 of: Proceedings of IEEE International Conference on Pervasive Computing and Communications.
Khosla, Aditya, Zhou, Tinghui, Malisiewicz, Tomasz, Efros, Alexei A., and Torralba, Antonio. 2012. Undoing the damage of dataset bias. Pages 158–171 of: Proceedings of European Conference on Computer Vision.
Kim, Edward, Corte-Real, Miguel, and Baloch, Zubair. 2016. A deep semantic mobile application for thyroid cytopathology. Proceedings of Medical Imaging 2016: PACS and Imaging Informatics: Next Generation and Innovations.
Kim, Taeksoo, Cha, Moonsu, Kim, Hyunsoo, Lee, Jung Kwon, and Kim, Jiwon. 2017. Learning to discover cross-domain relations with generative adversarial networks. Pages 1857–1865 of: Proceedings of International Conference on Machine Learning.
Kim, Yoon. 2014. Convolutional neural networks for sentence classification. Pages 1746–1751 of: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing.
Kiros, Ryan, Zhu, Yukun, Salakhutdinov, Ruslan R., et al. 2015. Skip-thought vectors. Pages 3294–3302 of: Advances in Neural Information Processing Systems.
Kjaergaard, Mikkel Baun, and Munk, Carsten Valdemar. 2008. Hyperbolic location fingerprinting: A calibration-free solution for handling differences in signal strength. Pages 110–116 of: Proceedings of the Sixth IEEE International Conference on Pervasive Computing and Communications.
Kober, Jens, Öztop, Erhan, and Peters, Jan. 2011. Reinforcement learning to adjust robot movements to new situations. Pages 2650–2655 of: Proceedings of the 22nd International Joint Conference on Artificial Intelligence.
Koch, Gregory. 2015. Siamese Neural Networks for One-Shot Image Recognition. M.Phil. thesis, University of Toronto.
Kolar, Mladen, Lafferty, John D., and Wasserman, Larry A. 2011. Union support recovery in multi-task learning. Journal of Machine Learning Research, 12, 2415–2435.
Koller, Daphne, and Friedman, Nir. 2009. Probabilistic Graphical Models: Principles and Techniques. MIT Press.
Kolodner, Janet. 1993. Case-Based Reasoning. Morgan Kaufmann.
Konidaris, George, and Barto, Andrew G. 2007. Building portable options: skill transfer in reinforcement learning. Pages 895–900 of: Proceedings of the 20th International Joint Conference on Artificial Intelligence.
Kotthoff, Lars, Thornton, Chris, Hoos, Holger H., Hutter, Frank, and Leyton-Brown, Kevin. 2017. Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA. Journal of Machine Learning Research, 18, 25:1–25:5.
Krallinger, Martin, and Valencia, Alfonso. 2005. Text-mining and information-retrieval services for molecular biology. Genome Biology, 6, 224.
Krishnan, P., Krishnakumar, A. S., Ju, Wen-Hua, Mallows, Colin, and Ganu, Sachin. 2004. A system for LEASE: Location estimation assisted by stationary emitters for indoor RF wireless networks. In: Proceedings of IEEE International Conference on Computer Communications.
Krizhevsky, Alex, and Hinton, Geoffrey. 2009. Learning Multiple Layers of Features from Tiny Images. Computer Science Department, University of Toronto, Technical Report.
Kulis, Brian, Saenko, Kate, and Darrell, Trevor. 2011. What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. Pages 1785–1792 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Kullback, S., and Leibler, R. A. 1951. On information and sufficiency. Annals of Mathematical Statistics, 22(1), 79–86.
Kumar, Abhishek, and Daumé III, Hal. 2012. Learning task grouping and overlap in multi-task learning. Proceedings of the 29th International Conference on Machine Learning.
Kumaraswamy, Raksha, Odom, Phillip, Kersting, Kristian, Leake, David, and Natarajan, Sriraam. 2015. Transfer learning via relational type matching. Pages 811–816 of: Proceedings of IEEE International Conference on Data Mining.
Kuzborskij, Ilja, and Orabona, Francesco. 2013. Stability and hypothesis transfer learning. Pages 942–950 of: Proceedings of the 30th International Conference on Machine Learning.
Ladd, Andrew M., Bekris, Kostas E., Rudys, Algis, et al. 2002. Robotics-based location sensing using wireless Ethernet. Pages 227–238 of: Proceedings of the 8th Annual International Conference on Mobile Computing and Networking.
Lafferty, John D., and Zhai, ChengXiang. 2001. Document language models, query models, and risk minimization for information retrieval. Pages 111–119 of: Croft, W. Bruce, Harper, David J., Kraft, Donald H., and Zobel, Justin (eds.), SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
Lai, Tze Leung, and Robbins, Herbert. 1985. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1), 4–22.
Lake, B. M., Salakhutdinov, R., and Tenenbaum, J. B. 2015. Human-level concept learning through probabilistic program induction. Science, 350(6266), 1332–1338.
Lake, Brenden, Salakhutdinov, Ruslan, Gross, Jason, and Tenenbaum, Joshua. 2011. One shot learning of simple visual concepts. Pages 2568–2573 of: Proceedings of the Annual Meeting of the Cognitive Science Society.
Lake, Brenden M., Salakhutdinov, Ruslan, and Tenenbaum, Joshua B. 2013. One-shot learning by inverting a compositional causal process. Pages 2526–2534 of: Advances in Neural Information Processing Systems.
Laroche, Romain, and Barlier, Merwan. 2017. Transfer reinforcement learning with shared dynamics. Pages 2147–2153 of: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
Larrañaga, Pedro, Calvo, Borja, Santana, Roberto, et al. 2006. Machine learning in bioinformatics. Briefings in Bioinformatics, 7(1), 86–112.
Lawrence, Neil D., and Platt, John C. 2004. Learning to learn with the informative vector machine. Proceedings of the Twenty-First International Conference on Machine Learning.
Lazaric, Alessandro. 2008. Knowledge Transfer in Reinforcement Learning. Ph.D. thesis, Politecnico di Milano.
Lazaric, Alessandro. 2012. Transfer in reinforcement learning: A framework and a survey. Pages 143–173 of: Wiering, Marco, and van Otterlo, Martijn (eds), Reinforcement Learning: State-of-the-Art.
Lazaric, Alessandro, and Ghavamzadeh, Mohammad. 2010. Bayesian multi-task reinforcement learning. Pages 599–606 of: Proceedings of the 27th International Conference on Machine Learning.
Lazaric, Alessandro, Restelli, Marcello, and Bonarini, Andrea. 2008. Transfer of samples in batch reinforcement learning. Pages 544–551 of: Proceedings of the Twenty-Fifth International Conference on Machine Learning.
Ledig, Christian, Theis, Lucas, Huszar, Ferenc, et al. 2017. Photo-realistic single image super-resolution using a generative adversarial network. Pages 4681–4690 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Lee, Byung-Jun, and Kim, Kee-Eung. 2016. Dialog history construction with long-short term memory for robust generative dialog state tracking. Dialogue & Discourse, 7(3), 47–64.
Lee, Giwoong, Yang, Eunho, and Hwang, Sung Ju. 2016. Asymmetric multi-task learning based on task relatedness and loss. Pages 230–238 of: Proceedings of the 33rd International Conference on Machine Learning.
Lee, Honglak, Battle, Alexis, Raina, Rajat, and Ng, Andrew Y. 2007. Efficient sparse coding algorithms. Pages 801–808 of: Advances in Neural Information Processing Systems.
Lee, Jaewoo, and Kifer, Daniel. 2018. Concentrated differentially private gradient descent with adaptive per-iteration privacy budget. Pages 1656–1665 of: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Lefèvre, Fabrice, Gašić, Milica, Jurčíček, F., et al. 2009. k-Nearest neighbor Monte-Carlo control algorithm for POMDP-based dialogue systems. Pages 272–275 of: Proceedings of the 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue.
Levin, Esther, Pieraccini, Roberto, and Eckert, Wieland. 1997. Learning dialogue strategies within the Markov decision process framework. Pages 72–79 of: Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding.
Li, Bin, Yang, Qiang, and Xue, Xiangyang. 2009a. Can movies and books collaborate? Cross-domain collaborative filtering for sparsity reduction. Pages 2052–2057 of: Proceedings of the 21st International Joint Conference on Artificial Intelligence.
Li, Bin, Yang, Qiang, and Xue, Xiangyang. 2009b. Transfer learning for collaborative filtering via a rating-matrix generative model. Pages 617–624 of: Proceedings of the 26th Annual International Conference on Machine Learning.
Li, Da, Yang, Yongxin, Song, Yi-Zhe, and Hospedales, Timothy M. 2017a. Deeper, broader and artier domain generalization. Pages 5543–5551 of: Proceedings of IEEE International Conference on Computer Vision.
Li, Fan, Yang, Yiming, and Xing, Eric P. 2005. From lasso regression to feature vector machine. Pages 779–786 of: Advances in Neural Information Processing Systems.
Li, Fangtao, Pan, Sinno Jialin, Jin, Ou, Yang, Qiang, and Zhu, Xiaoyan. 2012. Cross-domain co-extraction of sentiment and topic lexicons. Pages 410–419 of: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics.
Li, Fei-Fei, Fergus, Robert, and Perona, Pietro. 2006. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4), 594–611.
Li, Hui, Liao, Xuejun, and Carin, Lawrence. 2009c. Multi-task reinforcement learning in partially observable stochastic environments. Journal of Machine Learning Research, 10, 1131–1186.
Li, Jiwei, Galley, Michel, Brockett, Chris, Spithourakis, et al. 2016. A persona-based neural conversation model. Pages 994–1003 of: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.
Li, Lihong, Chu, Wei, Langford, John, and Schapire, Robert E. 2010. A contextual-bandit approach to personalized news article recommendation. Pages 661–670 of: Proceedings of the 19th International Conference on World Wide Web.
Li, Qi, and Ji, Heng. 2014. Incremental joint extraction of entity mentions and relations. Pages 402–412 of: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics.
Li, Sijin, Liu, Zhi-Qiang, and Chan, Antoni B. 2015. Heterogeneous multi-task learning for human pose estimation with deep convolutional neural network. International Journal of Computer Vision, 113(1), 19–36.
Li, Wen, Duan, Lixin, Dong, Xu, and Tsang, Ivor W. 2014. Learning with augmented features for supervised and semi-supervised heterogeneous domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(6), 1134–1148.
Li, Zheng, Zhang, Yu, Wei, Ying, Wu, Yuxiang, and Yang, Qiang. 2017b. End-to-end adversarial memory network for cross-domain sentiment classification. Pages 2237–2243 of: Proceedings of the International Joint Conference on Artificial Intelligence.
Liao, Renjie, Schwing, Alexander G., Zemel, Richard S., and Urtasun, Raquel. 2016. Learning deep parsimonious representations. Pages 5076–5084 of: Advances in Neural Information Processing Systems.
Liao, Xuejun, Xue, Ya, and Carin, Lawrence. 2005. Logistic regression with an auxiliary data source. Pages 505–512 of: Proceedings of the 22nd International Conference on Machine Learning.
Ling, Xiao, Xue, Gui-Rong, Dai, Wenyuan, et al. 2008. Can Chinese web pages be classified with English data source? Pages 969–978 of: Proceedings of the 17th International Conference on World Wide Web.
Liu, Bing. 2012. Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1), 1–167.
Liu, Bing, Hsu, Wynne, and Ma, Yiming. 1999. Mining association rules with multiple minimum supports. Pages 337–341 of: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Liu, Bo, Wei, Ying, Zhang, Yu, and Yang, Qiang. 2017. Deep neural networks for high dimension, low sample size data. Pages 2287–2293 of: Sierra, Carles (ed.), Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence.
Liu, Bo, Wei, Ying, Zhang, Yu, Yan, Zhixian, and Yang, Qiang. 2018. Transferable contextual bandit for cross-domain recommendation. Pages 3619–3626 of: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.
Liu, Chenxi, Zoph, Barret, Neumann, Maxim, et al. 2018c. Progressive neural architecture search. Pages 19–35 of: Proceedings of 15th European Conference on Computer Vision.
Liu, Dong, Hua, Xian-Sheng, Yang, Linjun, Wang, Meng, and Zhang, Hong-Jiang. 2009a. Tag ranking. Pages 351–360 of: Proceedings of the 18th International Conference on World Wide Web.
Liu, Han, Palatucci, Mark, and Zhang, Jian. 2009b. Blockwise coordinate descent procedures for the multi-task lasso, with applications to neural semantic basis discovery. Pages 649–656 of: Proceedings of the 26th International Conference on Machine Learning.
Liu, Jiahui, Dolan, Peter, and Pedersen, Elin Rønby. 2010a. Personalized news recommendation based on click behavior. Pages 31–40 of: Proceedings of the 15th International Conference on Intelligent User Interfaces.
Liu, Qi, Xu, Qian, Zheng, Vincent W., et al. 2010b. Multi-task learning for cross-platform siRNA efficacy prediction: an in-silico study. BMC Bioinformatics, 11, 181.
Liu, Qiuhua, Liao, Xuejun, and Carin, Lawrence. 2007. Semi-supervised multitask learning. Pages 937–944 of: Advances in Neural Information Processing Systems.
Liu, Qiuhua, Liao, Xuejun, Li, Hui, Stack, Jason R., and Carin, Lawrence. 2009c. Semisupervised multitask learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6), 1074–1086.
Liu, Wu, Mei, Tao, Zhang, Yongdong, Che, Cherry, and Luo, Jiebo. 2015a. Multi-task deep visual-semantic embedding for video thumbnail selection. Pages 3707–3715 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Liu, Xiaodong, Gao, Jianfeng, et al. 2015b. Representation learning using multi-task deep neural networks for semantic classification and information retrieval. Pages 912–921 of: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Long, Mingsheng, Wang, Jianmin, Ding, Guiguang, Shen, Dou, and Yang, Qiang. 2014. Transfer learning with graph co-regularization. IEEE Transactions on Knowledge and Data Engineering, 26(7), 1805–1818.
Long, Mingsheng, Cao, Yue, Wang, Jianmin, and Jordan, Michael I. 2015. Learning transferable features with deep adaptation networks. Pages 97–105 of: Proceedings of the 32nd International Conference on Machine Learning.
Long, Mingsheng, Zhu, Han, Wang, Jianmin, and Jordan, Michael I. 2017. Deep transfer learning with joint adaptation networks. Pages 2208–2217 of: Proceedings of International Conference on Machine Learning.
Lounici, Karim, Pontil, Massimiliano, Tsybakov, Alexandre B., and van de Geer, Sara A. 2009. Taking advantage of sparsity in multi-task learning. Proceedings of the 22nd Conference on Learning Theory.
Lozano, Aurelie C., and Swirszcz, Grzegorz. 2012. Multi-level lasso for sparse multi-task regression. Proceedings of the 29th International Conference on Machine Learning.
Lu, Guoyu, Yan, Yan, Ren, Li, et al. 2016. Where am I in the dark: Exploring active transfer learning on the use of indoor localization based on thermal imaging. Neurocomputing, 173, 83–92.
Lugosi, Gábor, Papaspiliopoulos, Omiros, and Stoltz, Gilles. 2009. Online multi-task learning with hard constraints. Proceedings of the 22nd Conference on Learning Theory.
Luo, Bingfeng, Feng, Yansong, Xu, Jianbo, Zhang, Xiang, and Zhao, Dongyan. 2017. Learning to predict charges for criminal cases with legal basis. Pages 2727–2736 of: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
Luong, Minh-Thang, Le, Quoc V., Sutskever, Ilya, Vinyals, Oriol, and Kaiser, Lukasz. 2016. Multi-task sequence to sequence learning. Proceedings of the 4th International Conference on Learning Representations.
Luria, Aleksandr R. 1976. Cognitive Development: Its Cultural and Social Foundations. Harvard University Press.
Ma, Zhigang, Yang, Yi, Nie, Feiping, et al. 2014. Harnessing lab knowledge for real-world action recognition. International Journal of Computer Vision, 109(1–2), 60–73.
Mahadevan, Sridhar, and Maggioni, Mauro. 2007. Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes. Journal of Machine Learning Research, 8, 2169–2231.
Mahajan, Dhruv, Girshick, Ross B., Ramanathan, Vignesh, et al. 2018. Exploring the limits of weakly supervised pretraining. CoRR, abs/1805.00932.
Mahmud, M. M., and Ray, Sylvian R. 2007. Transfer learning using Kolmogorov complexity: Basic theory and empirical evaluations. Pages 985–992 of: Advances in Neural Information Processing Systems.
Mairesse, François, and Walker, Marilyn A. 2008. Trainable generation of big-five personality styles through data-driven parameter estimation. Pages 165–173 of: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics.
Mairesse, François, and Walker, Marilyn A. 2011. Controlling user perceptions of linguistic style: Trainable generation of personality traits. Computational Linguistics, 37(3), 455–488.
Mairesse, François, Gašić, Milica, Jurčíček, Filip, et al. 2009. Spoken language understanding from unaligned data using discriminative classification models. Pages 4749–4752 of: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing.
Malaviya, Chaitanya, Neubig, Graham, and Littell, Patrick. 2017. Learning language representations for typology prediction. Pages 2529–2535 of: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
Man, Tong, Shen, Huawei, Jin, Xiaolong, and Cheng, Xueqi. 2017. Cross-domain recommendation: An embedding and mapping approach. Pages 2464–2470 of: Proceedings of the 26th International Joint Conference on Artificial Intelligence.
Mansour, Yishay, Mohri, Mehryar, and Rostamizadeh, Afshin. 2008. Domain adaptation with multiple sources. Pages 1041–1048 of: Advances in Neural Information Processing Systems.
Mansour, Yishay, Mohri, Mehryar, and Rostamizadeh, Afshin. 2009. Domain adaptation: Learning bounds and algorithms. Proceedings of the 22nd Conference on Learning Theory.
Mao, Xiangbo, Lin, Binbin, Cai, Deng, He, Xiaofei, and Pei, Jian. 2013. Parallel field alignment for cross media retrieval. Pages 897–906 of: Proceedings of the 21st ACM International Conference on Multimedia.
Marx, Zvika, Rosenstein, Michael T., Dietterich, Thomas G., and Kaelbling, Leslie Pack. 2008. Two algorithms for transfer learning. Inductive Transfer: 10 Years Later.
Maurer, Andreas. 2005. Algorithmic stability and meta-learning. Journal of Machine Learning Research, 6, 967–994.
Maurer, Andreas. 2006a. Bounds for linear multi-task learning. Journal of Machine Learning Research, 7, 117–139.
Maurer, Andreas. 2006b. The Rademacher complexity of linear transformation classes. Pages 65–78 of: Proceedings of the 19th Annual Conference on Learning Theory.
Maurer, Andreas. 2009. Transfer bounds for linear feature learning. Machine Learning, 75(3), 327–350.
Maurer, Andreas, Pontil, Massimiliano, and Romera-Paredes, Bernardino. 2013. Sparse coding for multitask and transfer learning. Pages 343–351 of: Proceedings of the 30th International Conference on Machine Learning.
Maurer, Andreas, Pontil, Massimiliano, and Romera-Paredes, Bernardino. 2016. The benefit of multitask representation learning. Journal of Machine Learning Research, 17, 1–32.
McAllester, David A. 1999. Some PAC-Bayesian theorems. Machine Learning, 37(3), 355–363.
McCann, Bryan, Bradbury, James, Xiong, Caiming, and Socher, Richard. 2017. Learned in translation: Contextualized word vectors. Pages 6297–6308 of: Advances in Neural Information Processing Systems.
McGovern, Amy, and Barto, Andrew G. 2001. Automatic discovery of subgoals in reinforcement learning using diverse density. Pages 361–368 of: Proceedings of the Eighteenth International Conference on Machine Learning.
McNamara, Daniel, and Balcan, Maria-Florina. 2017. Risk bounds for transferring representations with and without fine-tuning. Pages 2373–2381 of: Proceedings of the 34th International Conference on Machine Learning.
Menze, Bjoern H., Jakab, András, Bauer, Stefan, et al. 2015. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Transactions on Medical Imaging, 34(10), 1993–2024.
Mihalkova, Lilyana, and Mooney, Raymond J. 2008. Transfer learning by mapping with minimal target data. In: Proceedings of the AAAI-08 Workshop on Transfer Learning for Complex Tasks.
Mihalkova, Lilyana, Huynh, Tuyen N., and Mooney, Raymond J. 2007. Mapping and revising Markov logic networks for transfer learning. Pages 608–614 of: Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence.
Mikolov, Tomas, Chen, Kai, Corrado, Greg, and Dean, Jeffrey. 2013a. Efficient estimation of word representations in vector space. CoRR, abs/1301.3781.
Mikolov, Tomas, Sutskever, Ilya, Chen, Kai, Corrado, Greg S., and Dean, Jeff. 2013b. Distributed representations of words and phrases and their compositionality. Pages 3111–3119 of: Advances in Neural Information Processing Systems.
Min, Sewon, Seo, Minjoon, and Hajishirzi, Hannaneh. 2017. Question answering through transfer learning from large fine-grained supervision data. Pages 510–517 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Misra, Ishan, Shrivastava, Abhinav, Gupta, Abhinav, and Hebert, Martial. 2016. Cross-stitch networks for multi-task learning. Pages 3994–4003 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Mitchell, T., Cohen, W., Hruschka, E., et al. 2015. Never-ending learning. Pages 2302–2310 of: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
Mitra, Pabitra, Murthy, C. A., and Pal, Sankar K. 2002. Unsupervised feature selection using feature similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3), 301–312.
Mitzlaff, Folke, Atzmüller, Martin, Hotho, Andreas, and Stumme, Gerd. 2014. The social distributional hypothesis: A pragmatic proxy for homophily in online social networks. Social Network Analysis and Mining, 4(1), 216.
Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, et al. 2013. Playing Atari with deep reinforcement learning. CoRR, abs/1312.5602.
Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, et al. 2015. Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533.
Mo, Kaixiang, Zhang, Yu, Yang, Qiang, and Fung, Pascale. 2017. Fine grained knowledge transfer for personalized task-oriented dialogue systems. CoRR, abs/1711.04079.
Mo, Kaixiang, Zhang, Yu, Li, Shuangyin, Li, Jiajun, and Yang, Qiang. 2018. Personalizing a dialogue system with transfer reinforcement learning. Pages 5317–5324 of: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.
Moore, Andrew W. 1991. Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces. Pages 333–337 of: Proceedings of the Eighth International Conference on Machine Learning.
Mou, Lili, Meng, Zhao, Yan, Rui, et al. 2016. How transferable are neural networks in NLP applications? Pages 479–489 of: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
Mrkšić, Nikola, Séaghdha, Diarmuid Ó., Thomson, Blaise, et al. 2015. Multi-domain dialog state tracking using recurrent neural networks. Pages 794–799 of: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics.
Nassar, Marcel, Abdallah, Rami, Zeineddine, Hady Ali, Yaacoub, Elias, and Dawy, Zaher. 2008. A new multitask learning method for multiorganism gene network estimation. Pages 2287–2291 of: Proceedings of IEEE International Symposium on Information Theory.
Ng, Andrew Y., Jordan, Michael I., and Weiss, Yair. 2002. On spectral clustering: Analysis and an algorithm. Pages 849–856 of: Advances in Neural Information Processing Systems.
Nguyen, Hien Van, Ho, Huy Tho, Patel, Vishal M., and Chellappa, Rama. 2015. DASH-N: Joint hierarchical domain adaptation and feature learning. IEEE Transactions on Image Processing, 24(12), 5479–5491.
Nguyen, Khanh, Daumé III, Hal, and Boyd-Graber, Jordan L. 2017. Reinforcement learning for bandit neural machine translation with simulated human feedback. Pages 1464–1474 of: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
Ni, Jie, Qiu, Qiang, and Chellappa, Rama. 2013. Subspace interpolation via dictionary learning for unsupervised domain adaptation. Pages 692–699 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Ni, Lionel M., Liu, Yunhao, Lau, Yiu Cho, and Patil, Abhishek P. 2003. LANDMARC: Indoor location sensing using active RFID. Pages 407–415 of: Proceedings of IEEE International Conference on Pervasive Computing and Communications.
Nickel, Maximilian, Murphy, Kevin, Tresp, Volker, and Gabrilovich, Evgeniy. 2016. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE, 104(1), 11–33.
Niehues, Jan, and Cho, Eunah. 2017. Exploiting linguistic resources for neural machine translation using multi-task learning. Pages 80–89 of: Proceedings of the Second Conference on Machine Translation.
Norouzi, Mohammad, Mikolov, Tomas, Bengio, Samy, et al. 2013. Zero-shot learning by convex combination of semantic embeddings. CoRR, abs/1312.5650.
Nowozin, Sebastian, Cseke, Botond, and Tomioka, Ryota. 2016. f-GAN: Training generative neural samplers using variational divergence minimization. Pages 271–279 of: Advances in Neural Information Processing Systems.
Obozinski, Guillaume, Taskar, Ben, and Jordan, Michael. 2006. Multi-task Feature Selection. Tech. Report, Department of Statistics, University of California, Berkeley.
Obozinski, Guillaume, Taskar, Ben, and Jordan, Michael. 2010. Joint covariate selection and joint subspace selection for multiple classification problems. Statistics and Computing, 20(2), 231–252.
Obozinski, Guillaume, Wainwright, Martin J., and Jordan, Michael I. 2011. Support union recovery in high-dimensional multivariate regression. The Annals of Statistics, 39(1), 1–47.
Olshausen, Bruno A., and Field, David J. 1997. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37(23), 3311–3325.
Oquab, Maxime, Bottou, Leon, Laptev, Ivan, and Sivic, Josef. 2014. Learning and transferring mid-level image representations using convolutional neural networks. Pages 1717–1724 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Palatucci, Mark, Pomerleau, Dean, Hinton, Geoffrey E., and Mitchell, Tom M. 2009. Zero-shot learning with semantic output codes. Pages 1410–1418 of: Advances in Neural Information Processing Systems.
Pan, Jialin. 2010. Feature-based Transfer Learning with Real-world Applications. Ph.D. thesis, Hong Kong University of Science and Technology.
Pan, Rong, Zhao, Junhui, Zheng, Vincent Wenchen, et al. 2007a. Domain-constrained semi-supervised mining of tracking models in sensor networks. Pages 1023–1027 of: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Pan, Rong, Zhou, Yunhong, Cao, Bin, et al. 2008a. One-class collaborative filtering. Pages 502–511 of: Proceedings of the Eighth IEEE International Conference on Data Mining.
Pan, Sinno J., Kwok, James T., Yang, Qiang, and Pan, Jeffrey J. 2007b. Adaptive localization in a dynamic WiFi environment through multi-view learning. Pages 1108–1113 of: Proceedings of the 22nd National Conference on Artificial Intelligence.
Pan, Sinno Jialin. 2014. Transfer Learning. Pages 537–570 of: Data Classification: Algorithms and Applications. Chapman & Hall/CRC.
Pan, Sinno Jialin, and Yang, Qiang. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359.
Pan, Sinno Jialin, Kwok, James T., and Yang, Qiang. 2008b. Transfer learning via dimensionality reduction. Pages 677–682 of: Proceedings of the 23rd AAAI Conference on Artificial Intelligence.
Pan, Sinno Jialin, Shen, Dou, Yang, Qiang, and Kwok, James T. 2008c. Transferring localization models across space. Pages 1383–1388 of: Proceedings of the 23rd AAAI Conference on Artificial Intelligence.
Pan, Sinno Jialin, Ni, Xiaochuan, Sun, Jian-Tao, Yang, Qiang, and Chen, Zheng. 2010a. Cross-domain sentiment classification via spectral feature alignment. Pages 751–760 of: Proceedings of the 19th International Conference on World Wide Web.
Pan, Sinno Jialin, Tsang, Ivor W., Kwok, James T., and Yang, Qiang. 2011. Domain adaptation via transfer component analysis. IEEE Transactions on Neural Networks, 22(2), 199–210.
Pan, Weike, and Yang, Qiang. 2013. Transfer learning in heterogeneous collaborative filtering domains. Artificial Intelligence, 197, 39–55.
Pan, Weike, Xiang, Evan W., Liu, Nathan N., and Yang, Qiang. 2010b. Transfer learning in collaborative filtering for sparsity reduction. Pages 230–235 of: Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence.
Pan, Weike, Xiang, Evan Wei, and Yang, Qiang. 2012. Transfer learning in collaborative filtering with uncertain ratings. Pages 662–668 of: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence.
Pan, Weike, Liu, Zhuode, Ming, Zhong, et al. 2015a. Compressed knowledge transfer via factorization machine for heterogeneous collaborative recommendation. Knowledge-Based Systems, 85, 234–244.
Pan, Weike, Zhong, Hao, Xu, Congfu, and Ming, Zhong. 2015b. Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks. Knowledge-Based Systems, 73, 173–180.
Pan, Weike, Liu, Mengsi, and Ming, Zhong. 2016a. Transfer learning for heterogeneous one-class collaborative filtering. IEEE Intelligent Systems, 31(4), 43–49.
Pan, Weike, Yang, Qiang, Duan, Yuchao, and Ming, Zhong. 2016b. Transfer learning for semisupervised collaborative recommendation. ACM Transactions on Interactive Intelligent Systems, 6(2), 10:1–10:21.
Pan, Weike, Yang, Qiang, Duan, Yuchao, Tan, Ben, and Ming, Zhong. 2017. Transfer learning for behavior ranking. ACM Transactions on Intelligent Systems and Technology, 8(5), 65:1–65:23.
Pang, Bo, and Lee, Lillian. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135.
Pang, Bo, Lee, Lillian, and Vaithyanathan, Shivakumar. 2002. Thumbs up? Sentiment classification using machine learning techniques. Pages 79–86 of: Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing.
Pappas, Nikolaos, and Popescu-Belis, Andrei. 2017. Multilingual hierarchical attention networks for document classification. Pages 1015–1025 of: Proceedings of the 8th International Joint Conference on Natural Language Processing.
Parameswaran, Shibin, and Weinberger, Kilian Q. 2010. Large margin multi-task metric learning. Pages 1867–1875 of: Advances in Neural Information Processing Systems.
Parisotto, Emilio, Ba, Jimmy, and Salakhutdinov, Ruslan. 2016. Actor-mimic: Deep multi-task and transfer reinforcement learning. Proceedings of the 4th International Conference on Learning Representations.
Patel, Vishal M., Gopalan, Raghuraman, Li, Ruonan, and Chellappa, Rama. 2015. Visual domain adaptation: A survey of recent advances. IEEE Signal Processing Magazine, 32(3), 53–69.
Patterson, Donald J., Fox, Dieter, Kautz, Henry A., and Philipose, Matthai. 2005. Fine-grained activity recognition by aggregating abstract object usage. Pages 44–51 of: Proceedings of the Ninth IEEE International Symposium on Wearable Computers.
Pei, Zhongyi, Cao, Zhangjie, Long, Mingsheng, and Wang, Jianmin. 2018. Multi-adversarial domain adaptation. Pages 3934–3941 of: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.
Peng, Hao, Thomson, Sam, and Smith, Noah A. 2017. Deep multitask learning for semantic dependency parsing. Pages 2037–2048 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Pennington, Jeffrey, Socher, Richard, and Manning, Christopher. 2014. GloVe: Global vectors for word representation. Pages 1532–1543 of: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing.
Pentina, Anastasia, and Ben-David, Shai. 2015. Multi-task and lifelong learning of kernels. Pages 194–208 of: Proceedings of the 26th International Conference on Algorithmic Learning Theory.
Pentina, Anastasia, and Lampert, Christoph H. 2014. A PAC-Bayesian bound for lifelong learning. Pages 991–999 of: Proceedings of the 31st International Conference on Machine Learning.
Pentina, Anastasia, and Lampert, Christoph H. 2015. Lifelong learning with non-i.i.d. tasks. Pages 1540–1548 of: Advances in Neural Information Processing Systems.
Perkins, Simon, Lacker, Kevin, and Theiler, James. 2003. Grafting: Fast, incremental feature selection by gradient descent in function space. Journal of Machine Learning Research, 3, 1333–1356.
Perrot, Michaël, and Habrard, Amaury. 2015. A theoretical analysis of metric hypothesis transfer learning. Pages 1708–1717 of: Proceedings of the 32nd International Conference on Machine Learning.
Phillips, Caitlin. 2006. Knowledge Transfer in Markov Decision Processes. Tech. report, McGill University.
Pillonetto, Gianluigi, Dinuzzo, Francesco, and Nicolao, Giuseppe De. 2010. Bayesian online multitask learning of Gaussian processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(2), 193–205.
Plis, Sergey M., Hjelm, Devon R., Salakhutdinov, Ruslan, and Calhoun, Vince D. 2013. Deep learning for neuroimaging: A validation study. CoRR, abs/1312.5847.
Pong, Ting Kei, Tseng, Paul, Ji, Shuiwang, and Ye, Jieping. 2010. Trace norm regularization: Reformulations, algorithms, and multi-task learning. SIAM Journal on Optimization, 20(6), 3465–3489.
Ponomareva, Natalia, and Thelwall, Mike. 2012. Biographies or blenders: Which resource is best for cross-domain sentiment analysis? Pages 488–499 of: Proceedings of International Conference on Intelligent Text Processing and Computational Linguistics.
Pontil, Massimiliano, and Maurer, Andreas. 2013. Excess risk bounds for multitask learning with trace norm regularization. Pages 55–76 of: Proceedings of the 26th Annual Conference on Learning Theory.
Pugh, K. J., and Bergin, D. A. 2006. Motivational influences on transfer. Educational Psychologist, 41, 147–160.
Puniyani, Kriti, Kim, Seyoung, and Xing, Eric P. 2010. Multi-population GWA mapping via multi-task regularized regression. Bioinformatics, 26, i208–i216.
Qi, Guo-Jun, Aggarwal, Charu, and Huang, Thomas. 2011a. Towards semantic knowledge propagation from text corpus to web images. Pages 297–306 of: Proceedings of the 20th International Conference on World Wide Web.
Qi, Guo-Jun, Aggarwal, Charu, Rui, Yong, et al. 2011b. Towards cross-category knowledge propagation for learning visual concepts. Pages 897–904 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Qi, Yanjun, Tastan, Oznur, Carbonell, Jaime G., and Klein-Seetharaman, Judith. 2010. Semi-supervised multi-task learning for predicting interactions between HIV-1 and human proteins. Bioinformatics, 26, i645–i652.
Qiu, Qiang, Patel, Vishal M., Turaga, Pavan, and Chellappa, Rama. 2012. Domain adaptive dictionary learning. Pages 631–645 of: Proceedings of European Conference on Computer Vision.
Quiñonero-Candela, Joaquin, Sugiyama, Masashi, Schwaighofer, Anton, and Lawrence, Neil D. 2009. Dataset Shift in Machine Learning. MIT Press.
Radford, Alec, Metz, Luke, and Chintala, Soumith. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR, abs/1511.06434.
Raina, Rajat, Battle, Alexis, Lee, Honglak, Packer, Benjamin, and Ng, Andrew Y. 2007. Self-taught learning: Transfer learning from unlabeled data. Pages 759–766 of: Proceedings of the 24th International Conference on Machine Learning.
Raj, Anant, Namboodiri, Vinay P., and Tuytelaars, Tinne. 2015. Subspace alignment based domain adaptation for RCNN detector. Pages 166.1–166.11 of: Proceedings of the British Machine Vision Conference.
Rajpurkar, Pranav, Zhang, Jian, Lopyrev, Konstantin, and Liang, Percy. 2016. SQuAD: 100,000+ questions for machine comprehension of text. Pages 2383–2392 of: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
Rajpurkar, Pranav, Irvin, Jeremy, Zhu, Kaylie, et al. 2017. CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning. CoRR, abs/1711.05225.
Ranzato, Marc’Aurelio, Chopra, Sumit, Auli, Michael, and Zaremba, Wojciech. 2015. Sequence level training with recurrent neural networks. CoRR, abs/1511.06732.
Recanzone, Gregg H. 2009. Interactions of auditory and visual stimuli in space and time. Hearing Research, 258(1), 89–99.
Reichart, Roi, Tomanek, Katrin, Hahn, Udo, and Rappoport, Ari. 2008. Multi-task active learning for linguistic annotations. Pages 861–869 of: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics.
Reiss, Attila, and Stricker, Didier. 2012. Introducing a new benchmarked dataset for activity monitoring. Pages 108–109 of: Proceedings of the 16th International Symposium on Wearable Computers.
Ren, Hang, Xu, Weiqun, and Yan, Yonghong. 2014. Markovian discriminative modeling for cross-domain dialog state tracking. Pages 342–347 of: Proceedings of IEEE Spoken Language Technology Workshop.
Resnick, Paul, and Varian, Hal R. 1997. Recommender systems. Communications of the ACM, 40(3), 56–58.
Resnick, Paul, Iacovou, Neophytos, Suchak, Mitesh, Bergstrom, Peter, and Riedl, John. 1994. GroupLens: An open architecture for collaborative filtering of netnews. Pages 175–186 of: Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work.
Revow, Michael, Williams, Christopher K. I., and Hinton, Geoffrey E. 1996. Using generative models for handwritten digit recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(6), 592–606.
Richards, Bradley L., and Mooney, Raymond J. 1992. Learning relations by pathfinding. Pages 50–55 of: Proceedings of the 10th National Conference on Artificial Intelligence.
Richardson, Matthew, and Domingos, Pedro. 2006. Markov logic networks. Machine Learning, 62(1–2), 107–136.
Rohrbach, Marcus, Stark, Michael, Szarvas, György, Gurevych, Iryna, and Schiele, Bernt. 2010. What helps where – and why? Semantic relatedness for knowledge transfer. Pages 910–917 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Ruder, Sebastian, Bingel, Joachim, Augenstein, Isabelle, and Søgaard, Anders. 2017. Sluice networks: Learning what to share between loosely related tasks. CoRR, abs/1705.08142.
Rusu, Andrei A., Colmenarejo, Sergio Gomez, Gülçehre, Çaglar, et al. 2015. Policy distillation. CoRR, abs/1511.06295.
Ruvolo, Paul, and Eaton, Eric. 2013. ELLA: An efficient lifelong learning algorithm. Pages 507–515 of: Proceedings of the 30th International Conference on Machine Learning.
Saenko, Kate, Kulis, Brian, Fritz, Mario, and Darrell, Trevor. 2010. Adapting visual category models to new domains. Pages 213–226 of: Proceedings of European Conference on Computer Vision.
Saha, Avishek, Rai, Piyush, Daumé III, Hal, and Venkatasubramanian, Suresh. 2011. Online learning of multiple tasks and their relationships. Pages 643–651 of: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics.
Salimans, Tim, Goodfellow, Ian, Zaremba, Wojciech, et al. 2016. Improved techniques for training GANs. Pages 2234–2242 of: Advances in Neural Information Processing Systems.
Samala, Ravi K., Chan, Heang-Ping, Hadjiiski, Lubomir, et al. 2016. Mass detection in digital breast tomosynthesis: Deep convolutional neural network with transfer learning from mammography. Medical Physics, 43(12), 6654–6666.
Schank, Roger C. 1983. Dynamic Memory – A Theory of Reminding and Learning in Computers and People. Cambridge University Press.
Schölkopf, Bernhard, and Smola, Alexander J. 2001. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
Schunk, D. 1965. Learning Theories: An Educational Perspective. Pearson.
Schwaighofer, Anton, Tresp, Volker, and Yu, Kai. 2005. Learning Gaussian process kernels via hierarchical Bayes. Pages 1209–1216 of: Advances in Neural Information Processing Systems.
Schweikert, Gabriele Beate, Widmer, Christian, Schölkopf, Bernhard, and Rätsch, Gunnar. 2008. An empirical analysis of domain adaptation algorithms for genomic sequence analysis. Pages 1433–1440 of: Advances in Neural Information Processing Systems.
Seo, Minjoon, Kembhavi, Aniruddha, Farhadi, Ali, and Hajishirzi, Hannaneh. 2016. Bidirectional attention flow for machine comprehension. CoRR, abs/1611.01603.
Serban, Iulian V., Sordoni, Alessandro, Bengio, Yoshua, Courville, Aaron, and Pineau, Joelle. 2015. Building end-to-end dialogue systems using generative hierarchical neural network models. arXiv preprint, arXiv:1507.04808.
Serban, Iulian V., Sordoni, Alessandro, Bengio, Yoshua, Courville, Aaron, and Pineau, Joelle. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. Pages 3776–3784 of: Proceedings of the 30th AAAI Conference on Artificial Intelligence.
Serban, Iulian Vlad, Sordoni, Alessandro, Lowe, Ryan, et al. 2017. A hierarchical latent variable encoder-decoder model for generating dialogues. Pages 3295–3301 of: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
Sermanet, Pierre, Eigen, David, Zhang, Xiang, et al. 2013. Overfeat: Integrated recognition, localization and detection using convolutional networks. CoRR, abs/1312.6229.
Sevakula, R. K., Singh, V., Verma, N. K., Kumar, C., and Cui, Y. 2018. Transfer learning for molecular cancer classification using deep neural networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics.
Shang, Lifeng, Lu, Zhengdong, and Li, Hang. 2015. Neural responding machine for short-text conversation. Pages 1577–1586 of: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics.
Shekhar, Shashi, Patel, Vishal M., Nguyen, Hien, and Chellappa, Rama. 2013. Generalized domain-adaptive dictionaries. Pages 361–368 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Shen, Dou, Pan, Rong, Sun, Jian-Tao, et al. 2005. Q2C@UST: Our winning solution to query classification in KDDCUP 2005. SIGKDD Explorations, 7(2), 100–110.
Shen, Dou, Pan, Rong, Sun, Jian-Tao, et al. 2006a. Query enrichment for web-query classification. ACM Transactions on Information Systems, 24(3), 320–352.
Shen, Dou, Sun, Jian-Tao, Yang, Qiang, and Chen, Zheng. 2006b. Building bridges for web query classification. Pages 131–138 of: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
Sherstov, Alexander A., and Stone, Peter. 2005. Improving action selection in MDP’s via knowledge transfer. Pages 1024–1029 of: Proceedings of the Twentieth National Conference on Artificial Intelligence.
Shi, Xiaoxiao, Liu, Qi, Fan, Wei, Yang, Qiang, and Yu, Philip S. 2010a. Predictive modeling with heterogeneous sources. Pages 814–825 of: Proceedings of the SIAM International Conference on Data Mining.
Shi, Xiaoxiao, Liu, Qi, Fan, Wei, Yu, Philip S., and Zhu, Ruixin. 2010b. Transfer learning on heterogenous feature spaces via spectral transformation. Pages 1049–1054 of: Proceedings of the IEEE International Conference on Data Mining.
Shi, Xiaoxiao, Paiement, Jean-François, Grangier, David, and Yu, Philip S. 2012. Learning from heterogeneous sources via gradient boosting consensus. Pages 224–235 of: Proceedings of the SIAM International Conference on Data Mining.
Shi, Xiaoxiao, Liu, Qi, Fan, Wei, and Yu, Philip S. 2013a. Transfer across completely different feature spaces via spectral embedding. IEEE Transactions on Knowledge and Data Engineering, 25(4), 906–918.
Shi, Yangyang, Larson, Martha, and Jonker, Catholijn M. 2015. Recurrent neural network language model adaptation with curriculum learning. Computer Speech and Language, 33(1), 136–154.
Shi, Yuan, and Sha, Fei. 2012. Information-theoretical learning of discriminative clusters for unsupervised domain adaptation. Pages 1275–1282 of: Proceedings of the 29th International Conference on Machine Learning.
Shi, Yue, Larson, Martha, and Hanjalic, Alan. 2013b. Mining contextual movie similarity with matrix factorization for context-aware recommendation. ACM Transactions on Intelligent Systems and Technology, 4(1), 16:1–16:19.
Shin, Hoo-Chang, Roth, Holger R., Gao, Mingchen, et al. 2016. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Transactions on Medical Imaging, 35(5), 1285–1298.
Shokri, Reza, and Shmatikov, Vitaly. 2015. Privacy-preserving deep learning. Pages 1310–1321 of: Proceedings of ACM Conference on Computer and Communications Security.
Shrivastava, Ashish, Pfister, Tomas, Tuzel, Oncel, et al. 2017. Learning from simulated and unsupervised images through adversarial training. Pages 2242–2251 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Shu, Xiangbo, Qi, Guo-Jun, Tang, Jinhui, and Wang, Jingdong. 2015. Weakly-shared deep transfer networks for heterogeneous-domain knowledge propagation. Pages 35–44 of: Proceedings of the 23rd ACM International Conference on Multimedia.
Si, Si, Tao, Dacheng, and Geng, Bo. 2010. Bregman divergence-based regularization for transfer subspace learning. IEEE Transactions on Knowledge and Data Engineering, 22(7), 929–942.
Silver, Daniel L., and Mercer, Robert E. 1996. The parallel transfer of task knowledge using dynamic learning rates based on a measure of relatedness. Connection Science Special Issue: Transfer in Inductive Systems, 8(2), 277–294.
Silver, Daniel L., Yang, Qiang, and Li, Lianghao. 2013. Lifelong machine learning systems: Beyond learning algorithms. Proceedings of the 2013 AAAI Spring Symposium on Lifelong Machine Learning, AAAI Technical Report, vol. SS-13-05.
Silver, David, Huang, Aja, Maddison, Chris J., et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.
Singh, Ajit P., and Gordon, Geoffrey J. 2008. Relational learning via collective matrix factorization. Pages 650–658 of: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Singh, Satinder P., Kearns, Michael J., Litman, Diane J., and Walker, Marilyn A. 1999. Reinforcement learning for spoken dialogue systems. Pages 956–962 of: Advances in Neural Information Processing Systems.
Smola, Alexander J., and Schölkopf, Bernhard. 2004. A tutorial on support vector regression. Statistics and Computing, 14(3), 199–222.
Smola, Alex, Gretton, Arthur, Song, Le, and Schölkopf, Bernhard. 2007a. A Hilbert space embedding for distributions. Pages 13–31 of: Proceedings of International Conference on Algorithmic Learning Theory.
Smola, Alexander J., Gretton, Arthur, Song, Le, and Schölkopf, Bernhard. 2007b. A Hilbert space embedding for distributions. Pages 40–41 of: Proceedings of the 10th International Conference on Discovery Science.
Snel, Matthijs, and Whiteson, Shimon. 2014. Learning potential functions and their representations for multi-task reinforcement learning. Autonomous Agents and Multi-Agent Systems, 28(4), 637–681.
Socher, Richard, Ganjoo, Milind, Manning, Christopher D., and Ng, Andrew Y. 2013a. Zero-shot learning through cross-modal transfer. Pages 935–943 of: Advances in Neural Information Processing Systems.
Socher, Richard, Perelygin, Alex, Wu, Jean Y., et al. 2013b. Recursive deep models for semantic compositionality over a sentiment treebank. Pages 1631–1642 of: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing.
Søgaard, Anders, and Goldberg, Yoav. 2016. Deep multi-task learning with low level tasks supervised at lower layers. Pages 231–235 of: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.
Solnon, Matthieu, Arlot, Sylvain, and Bach, Francis R. 2012. Multi-task regression using minimal penalties. Journal of Machine Learning Research, 13, 2773–2812.
Song, Jinhua, Gao, Yang, Wang, Hao, and An, Bo. 2016. Measuring the distance between finite Markov decision processes. Pages 468–476 of: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems.
Sordoni, Alessandro, Bengio, Yoshua, Vahabi, Hossein, et al. 2015. A hierarchical recurrent encoder-decoder for generative context-aware query suggestion. Pages 553–562 of: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management.
Srivastava, Nitish, Hinton, Geoffrey E., Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. 2014. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1), 1929–1958.
Sugiyama, Masashi, Nakajima, Shinichi, Kashima, Hisashi, von Bünau, Paul, and Kawanabe, Motoaki. 2008. Direct importance estimation with model selection and its application to covariate shift adaptation. Pages 1433–1440 of: Advances in Neural Information Processing Systems.
Suk, Heung-Il, and Shen, Dinggang. 2013. Deep learning-based feature representation for AD/MCI classification. Pages 583–590 of: Proceedings of the 16th International Conference on Medical Image Computing and Computer-Assisted Intervention.
Suk, Heung-Il, Lee, Seong-Whan, and Shen, Dinggang. 2014. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage, 101, 569–582.
Sun, Kai, Xie, Qizhe, and Yu, Kai. 2016. Recurrent polynomial network for dialogue state tracking. Dialogue and Discourse, 7(3), 65–88.
Sutskever, Ilya, Vinyals, Oriol, and Le, Quoc V. 2014. Sequence to sequence learning with neural networks. Pages 3104–3112 of: Advances in Neural Information Processing Systems.
Sutton, Richard S., and Barto, Andrew G. 1998. Reinforcement Learning – An Introduction. MIT Press.
Sutton, Richard S., Precup, Doina, and Singh, Satinder. 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1–2), 181–211.
Sweeney, Latanya. 2002. k-Anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5), 557–570.
Tai, Lei, Paolo, Giuseppe, and Liu, Ming. 2017. Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation. Pages 31–36 of: Proceedings of 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems.
Tamada, Yoshinori, Bannai, Hideo, Kanehisa, Minoru, and Miyano, Satoru. 2005. Utilizing evolutionary information and gene expression data for estimating gene networks with Bayesian network models. Journal of Bioinformatics and Computational Biology, 3(6), 1295–1313.
Tan, Ben, Zhong, Erheng, Ng, Michael K., and Yang, Qiang. 2014. Mixed-transfer: Transfer learning over mixed graphs. Pages 208–216 of: Proceedings of the SIAM International Conference on Data Mining.
Tan, Ben, Song, Yangqiu, Zhong, Erheng, and Yang, Qiang. 2015. Transitive transfer learning. Pages 1155–1164 of: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Tan, Ben, Zhang, Yu, Pan, Sinno Jialin, and Yang, Qiang. 2017. Distant domain transfer learning. Pages 2604–2610 of: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
Tang, Duyu, Qin, Bing, Feng, Xiaocheng, and Liu, Ting. 2015. Target-dependent sentiment classification with long short term memory. CoRR, abs/1512.01100.
Taylor, Matthew E., and Stone, Peter. 2005. Behavior transfer for value-function-based reinforcement learning. Pages 53–59 of: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems.
Taylor, Matthew E., and Stone, Peter. 2007. Cross-domain transfer for reinforcement learning. Pages 879–886 of: Proceedings of the Twenty-Fourth International Conference on Machine Learning.
Taylor, Matthew E., and Stone, Peter. 2009. Transfer learning for reinforcement learning domains: A survey. Journal of Machine Learning Research, 10, 1633–1685.
Taylor, Matthew E., Stone, Peter, and Liu, Yaxin. 2005. Value functions for RL-based behavior transfer: A comparative study. Pages 880–885 of: Proceedings of the Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference.
Taylor, Matthew E., Whiteson, Shimon, and Stone, Peter. 2007. Transfer via inter-task mappings in policy search reinforcement learning. Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems.
Taylor, Matthew E., Jong, Nicholas K., and Stone, Peter. 2008a. Transferring instances for model-based reinforcement learning. Pages 488–505 of: Proceedings of European Conference on Machine Learning and Practice of Knowledge Discovery in Databases.
Taylor, Matthew E., Kuhlmann, Gregory, and Stone, Peter. 2008b. Autonomous transfer for reinforcement learning. Pages 283–290 of: Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems.
Tewari, Ambuj, Ravikumar, Pradeep K., and Dhillon, Inderjit S. 2011. Greedy algorithms for structurally constrained high dimensional problems. Pages 882–890 of: Advances in Neural Information Processing Systems.
Thomson, Blaise, and Young, Steve. 2010. Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems. Computer Speech and Language, 24(4), 562–588.
Thorndike, Edward L., and Woodworth, R. S. 1901. The influence of improvement in one mental function upon the efficiency of other functions. II. The estimation of magnitudes. Psychological Review, 8(01), 384–395.
Thrun, Sebastian. 1995. Explanation-Based Neural Network Learning: A Lifelong Learning Approach. Ph.D. thesis, University of Bonn.
Thrun, Sebastian, and O’Sullivan, Joseph. 1996. Discovering structure in multiple learning tasks: The TC algorithm. Pages 489–497 of: Proceedings of the 13th International Conference on Machine Learning.
Tibshirani, Robert. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Toffler, Alvin. 1970. Future Shock. Random House.
Tommasi, Tatiana, Orabona, Francesco, and Caputo, Barbara. 2010. Safety in numbers: Learning categories from few examples with multi model knowledge transfer. Pages 3081–3088 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Tommasi, Tatiana, Orabona, Francesco, and Caputo, Barbara. 2014. Learning categories from few examples with multi model knowledge transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(5), 928–941.
Tompson, Jonathan, Stein, Murphy, LeCun, Yann, and Perlin, Ken. 2014. Real-time continuous pose recovery of human hands using convolutional networks. ACM Transactions on Graphics, 33(5), 169:1–169:10.
Topin, Nicholay, Haltmeyer, Nicholas, Squire, Shawn, et al. 2015. Portable option discovery for automated learning transfer in object-oriented Markov decision processes. Pages 3856–3864 of: Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence.
Toshniwal, Shubham, Tang, Hao, Lu, Liang, and Livescu, Karen. 2017. Multitask learning with low-level auxiliary tasks for encoder-decoder based speech recognition. Proceedings of the 18th Annual Conference of the International Speech Communication Association.
Tsuboi, Yuta, Kashima, Hisashi, Hido, Shohei, Bickel, Steffen, and Sugiyama, Masashi. 2009. Direct density ratio estimation for large-scale covariate shift adaptation. Journal of Information Processing, 17, 138–155.
Tür, Gökhan. 2005. Model adaptation for spoken language understanding. Pages 41–44 of: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing.
Tür, Gökhan. 2006. Multitask learning for spoken language understanding. Pages 585–588 of: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing.
Tzeng, Eric, Hoffman, Judy, Zhang, Ning, Saenko, Kate, and Darrell, Trevor. 2014. Deep domain confusion: Maximizing for domain invariance. CoRR, abs/1412.3474.
Tzeng, Eric, Hoffman, Judy, Darrell, Trevor, and Saenko, Kate. 2015. Simultaneous deep transfer across domains and tasks. Pages 4068–4076 of: Proceedings of IEEE International Conference on Computer Vision.
Tzeng, Eric, Hoffman, Judy, Saenko, Kate, and Darrell, Trevor. 2017. Adversarial discriminative domain adaptation. Pages 2962–2971 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Vail, Douglas L., Veloso, Manuela M., and Lafferty, John D. 2007. Conditional random fields for activity recognition. In: Proceedings of the Sixth International Joint Conference on Autonomous Agents and Multiagent Systems.
van Haaren, Jan, Kolobov, Andrey, and Davis, Jesse. 2015. TODTLER: Two-order-deep transfer learning. Pages 3007–3015 of: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
van Kasteren, Tim, Noulas, Athanasios K., Englebienne, Gwenn, and Kröse, Ben J. A. 2008. Accurate activity recognition in a home setting. Pages 1–9 of: Proceedings of the 10th International Conference on Ubiquitous Computing.
Vapnik, Vladimir. 1995. The Nature of Statistical Learning Theory. Springer.
Vapnik, Vladimir N. 1998. Statistical Learning Theory. Wiley-Interscience.
Venugopalan, Subhashini, Rohrbach, Marcus, Donahue, Jeffrey, et al. 2015a. Sequence to sequence – Video to text. Pages 4534–4542 of: Proceedings of the IEEE International Conference on Computer Vision.
Venugopalan, Subhashini, Xu, Huijuan, Donahue, Jeff, et al. 2015b. Translating videos to natural language using deep recurrent neural networks. Pages 1494–1504 of: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Vincent, Pascal, Larochelle, Hugo, Bengio, Yoshua, and Manzagol, Pierre-Antoine. 2008. Extracting and composing robust features with denoising autoencoders. Pages 1096–1103 of: Proceedings of the 25th International Conference on Machine Learning.
Vinyals, Oriol, Toshev, Alexander, Bengio, Samy, and Erhan, Dumitru. 2015. Show and tell: A neural image caption generator. Pages 3156–3164 of: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Vinyals, Oriol, Blundell, Charles, Lillicrap, Tim, Kavukcuoglu, Koray, and Wierstra, Daan. 2016. Matching networks for one shot learning. Pages 3630–3638 of: Advances in Neural Information Processing Systems.
Vondrick, Carl, Pirsiavash, Hamed, and Torralba, Antonio. 2016. Generating videos with scene dynamics. Pages 613–621 of: Advances in Neural Information Processing Systems.
Walker, Marilyn A., Fromer, Jeanne C., and Narayanan, Shrikanth. 1998. Learning optimal dialogue strategies: A case study of a spoken dialogue agent for email. Pages 1345–1351 of: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics.
Walker, Marilyn A., Stent, Amanda, Mairesse, François, and Prasad, Rashmi. 2007. Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413–456.
Walsh, Thomas J., Li, Lihong, and Littman, Michael L. 2006. Transferring state abstractions between MDPs. Proceedings of ICML Workshop on Structural Knowledge Transfer for Machine Learning.
Wan, Xiang, Yang, Can, Yang, Qiang, et al. 2009. MegaSNPHunter: A learning approach to detect disease predisposition SNPs and high level interactions in genome wide association study. BMC Bioinformatics, 10, 13.
Wang, Boyu, and Pineau, Joelle. 2016. Generalized dictionary for multitask learning with boosting. Pages 2097–2103 of: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence.
Wang, Chang, and Mahadevan, Sridhar. 2009. Manifold alignment without correspondence. Pages 1273–1278 of: Proceedings of the 21st International Joint Conference on Artificial Intelligence.
Wang, Chang, and Mahadevan, Sridhar. 2011. Heterogeneous domain adaptation using manifold alignment. Pages 1541–1546 of: Proceedings of the 22nd International Joint Conference on Artificial Intelligence.
Wang, Daixin, Cui, Peng, and Zhu, Wenwu. 2018a. Deep asymmetric transfer network for unbalanced domain adaptation. Pages 443–450 of: Proceedings of the 32nd AAAI Conference on Artificial Intelligence.
Wang, Hua, Huang, Heng, Nie, Feiping, and Ding, Chris. 2011. Cross-language web page classification via dual knowledge transfer using nonnegative matrix tri-factorization. Pages 933–942 of: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Wang, Hua-Yan, and Yang, Qiang. 2011. Transfer learning by structural analogy. Pages 513–518 of: Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence.
Wang, Hua-Yan, Zheng, Vincent Wenchen, Zhao, Junhui, and Yang, Qiang. 2010. Indoor localization in multi-floor environments with reduced effort. Pages 244–252 of: Proceedings of the 8th Annual IEEE International Conference on Pervasive Computing and Communications.
Wang, Jialei, Kolar, Mladen, and Srebro, Nathan. 2016a. Distributed multi-task learning. Pages 751–760 of: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics.
Wang, Jindong, Chen, Yiqiang, Hao, Shuji, Peng, Xiaohui, and Hu, Lisha. 2017a. Deep learning for sensor-based activity recognition: A survey. CoRR, abs/1707.03502.
Wang, Jindong, Chen, Yiqiang, Hu, Lisha, Peng, Xiaohui, and Yu, Philip S. 2018b. Stratified transfer learning for cross-domain activity recognition. CoRR, abs/1801.00820.
Wang, Sheng, Li, Zhen, Yu, Yizhou, and Xu, Jinbo. 2017b. Folding membrane proteins by deep transfer learning. CoRR, abs/1708.08407.
Wang, Shenlong, Zhang, Lei, Liang, Yan, and Pan, Quan. 2012. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. Pages 2216–2223 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Wang, Shuai, Chen, Zhiyuan, and Liu, Bing. 2016b. Mining aspect-specific opinion using a holistic lifelong topic model. Pages 167–176 of: Proceedings of the 25th International Conference on World Wide Web.
Wang, Shuohang, Yu, Mo, Guo, Xiaoxiao, et al. 2018c. R3: Reinforced ranker-reader for open-domain question answering. Pages 5981–5988 of: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.
Wang, Sida, and Manning, Christopher D. 2012. Baselines and bigrams: Simple, good sentiment and topic classification. Pages 90–94 of: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics.
Wang, Wenhui, Yang, Nan, Wei, Furu, Chang, Baobao, and Zhou, Ming. 2017c. Gated self-matching networks for reading comprehension and question answering. Pages 189–198 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Wang, Xin, Bi, Jinbo, Yu, Shipeng, and Sun, Jiangwen. 2014. On multiplicative multitask feature learning. Pages 2411–2419 of: Advances in Neural Information Processing Systems.
Wang, Xuezhi, and Schneider, Jeff G. 2015. Generalization bounds for transfer learning under model shift. Pages 922–931 of: Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence.
Wang, Yang, Gu, Quanquan, and Brown, Donald E. 2018d. Differentially private hypothesis transfer learning. Pages 811–826 of: Proceedings of European Conference on Machine Learning and Knowledge Discovery in Databases.
Wang, Zhuoran, and Lemon, Oliver. 2013. A simple and generic belief tracking mechanism for the dialog state tracking challenge: On the believability of observed information. Pages 423–432 of: Proceedings of the 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue.
Wei, Ying, Zhu, Yin, Leung, Cane Wing-ki, Song, Yangqiu, and Yang, Qiang. 2016a. Instilling social to physical: Co-regularized heterogeneous transfer learning. Pages 1338–1344 of: Proceedings of the 30th AAAI Conference on Artificial Intelligence.
Wei, Ying, Zheng, Yu, and Yang, Qiang. 2016b. Transfer knowledge between cities. Pages 1905–1914 of: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Wei, Ying, Zhang, Yu, Huang, Junzhou, and Yang, Qiang. 2018. Transfer learning via learning to transfer. Pages 5072–5081 of: Proceedings of the 35th International Conference on Machine Learning.
Weinberger, Kilian Q., Sha, Fei, and Saul, Lawrence K. 2004. Learning a kernel matrix for nonlinear dimensionality reduction. Proceedings of the Twenty-First International Conference on Machine Learning.
Wen, Tsung-Hsien, Heidel, Aaron, Lee, Hung-yi, Tsao, Yu, and Lee, Lin-Shan. 2013. Recurrent neural network based language model personalization by social network crowd-sourcing. Pages 2703–2707 of: Proceedings of the 14th Annual Conference of the International Speech Communication Association.
Wen, Tsung-Hsien, Gašić, Milica, Mrkšić, Nikola, et al. 2015a. Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. Pages 1711–1721 of: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.
Wen, Tsung-Hsien, Gašić, Milica, Mrkšić, Nikola, et al. 2015b. Toward multi-domain language generation using recurrent neural networks. NIPS Workshop on ML for SLU and Interaction.
Wen, Tsung-Hsien, Gašić, Milica, Mrkšić, Nikola, et al. 2016. Multi-domain neural network language generation for spoken dialogue systems. Pages 120–129 of: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Widmer, Christian, Leiva, Jose, Altun, Yasemin, and Rätsch, Gunnar. 2010a. Leveraging sequence classification by taxonomy-based multitask learning. Pages 522–534 of: Proceedings of the 14th Annual International Conference on Research in Computational Molecular Biology.
Widmer, Christian, Toussaint, Nora C., Altun, Yasemin, Kohlbacher, Oliver, and Rätsch, Gunnar. 2010b. Novel machine learning methods for MHC Class I binding prediction. Pages 98–109 of: Proceedings of the 5th IAPR International Conference on Pattern Recognition in Bioinformatics.
Widmer, Christian, Toussaint, Nora C., Altun, Yasemin, and Rätsch, Gunnar. 2010c. Inferring latent task structure for multitask learning by multiple kernel learning. BMC Bioinformatics, 1(Suppl. 8), 55.Google Scholar
Williams, Jason. 2013. Multi-domain learning and generalization in dialog state tracking. Pages 433–441 of: Proceedings of the 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue.Google Scholar
Williams, Jason D. 2008a. The best of both worlds: Unifying conventional dialog systems and POMDPs. Pages 1173–1176 of: Proceedings of the 9th Annual Conference of the International Speech Communication Association.Google Scholar
Williams, Jason D. 2008b. Integrating expert knowledge into POMDP optimization for spoken dialog systems. Proceedings of the AAAI Workshop on Advancements in POMDP Solvers.Google Scholar
Wilson, Aaron, Fern, Alan, Ray, Soumya, and Tadepalli, Prasad. 2007. Multi-task reinforcement learning: A hierarchical Bayesian approach. Pages 1015–1022 of: Proceedings of the Twenty-Fourth International Conference on Machine Learning.Google Scholar
Winston, Patrick H. 1980. Learning and reasoning by analogy. Communications of the ACM, 23(12), 689–703.
Wong, Catherine, Houlsby, Neil, Lu, Yifeng, and Gesmundo, Andrea. 2018. Transfer learning with neural AutoML. Pages 8366–8375 of: Advances in Neural Information Processing Systems 31.
Wood, Erroll, Baltrušaitis, Tadas, Morency, Louis-Philippe, Robinson, Peter, and Bulling, Andreas. 2016. Learning an appearance-based gaze estimator from one million synthesised images. Pages 131–138 of: Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research and Applications.
Wu, Pengcheng, and Dietterich, Thomas G. 2004. Improving SVM accuracy by training on auxiliary data sources. Pages 111–117 of: Proceedings of the 21st International Conference on Machine Learning.
Wu, Shuangzhi, Zhang, Dongdong, Yang, Nan, Li, Mu, and Zhou, Ming. 2017. Sequence-to-dependency neural machine translation. Pages 698–707 of: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.
Wu, Xinxiao, Wang, Han, Liu, Cuiwei, and Jia, Yunde. 2013. Cross-view action recognition over heterogeneous feature spaces. Pages 609–616 of: Proceedings of the IEEE International Conference on Computer Vision.
Xie, Liyang, Baytas, Inci M., Lin, Kaixiang, and Zhou, Jiayu. 2017. Privacy-preserving distributed multi-task learning with asynchronous updates. Pages 1195–1204 of: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Xie, Michael, Jean, Neal, Burke, Marshall, Lobell, David, and Ermon, Stefano. 2016. Transfer learning from deep features for remote sensing and poverty mapping. Pages 3929–3935 of: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence.
Xing, Eric P., Jordan, Michael I., and Karp, Richard M. 2001. Feature selection for high-dimensional genomic microarray data. Pages 601–608 of: Proceedings of the 8th International Conference on Machine Learning.
Xu, Jiaolong, Ramos, Sebastian, Vázquez, David, and López, Antonio M. 2014a. Domain adaptation of deformable part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12), 2367–2380.
Xu, Kelvin, Ba, Jimmy, Kiros, Ryan, et al. 2015. Show, attend and tell: Neural image caption generation with visual attention. Pages 2048–2057 of: Proceedings of the 32nd International Conference on Machine Learning.
Xu, Qian, and Yang, Qiang. 2011. A survey of transfer and multitask learning in bioinformatics. Journal of Computing Science and Engineering, 5(3), 257–268.
Xu, Qian, Xiang, Evan Wei, and Yang, Qiang. 2010. Protein–protein interaction prediction via collective matrix factorization. Pages 62–67 of: Proceedings of IEEE International Conference on Bioinformatics and Biomedicine.
Xu, Qian, Pan, Sinno Jialin, Xue, Hannah Hong, and Yang, Qiang. 2011. Multitask learning for protein subcellular location prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8(3), 748–759.
Xu, Yonghui, Pan, Sinno Jialin, Xiong, Hui, Wu, et al. 2017. A unified framework for metric transfer learning. IEEE Transactions on Knowledge and Data Engineering, 29(6), 1158–1171.
Xu, Zheng, Li, Wen, Niu, Li, and Xu, Dong. 2014b. Exploiting low-rank structure from latent domains for domain generalization. Pages 628–643 of: Proceedings of the 13th European Conference on Computer Vision.
Xu, Zhixiang, Huang, Gao, Weinberger, Kilian Q., and Zheng, Alice X. 2014c. Gradient boosted feature selection. Pages 522–531 of: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.
Xue, Ya, Liao, Xuejun, Carin, Lawrence, and Krishnapuram, Balaji. 2007. Multi-task learning for classification with Dirichlet process priors. Journal of Machine Learning Research, 8, 35–63.
Yamada, Makoto, Jitkrittum, Wittawat, Sigal, Leonid, Xing, Eric P., and Sugiyama, Masashi. 2014. High-dimensional feature selection by feature-wise kernelized lasso. Neural Computation, 26(1), 185–207.
Yang, Bishan, and Mitchell, Tom. 2017. A joint sequential and relational model for frame-semantic parsing. Pages 1247–1256 of: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
Yang, Can, He, Zengyou, Wan, Xiang, et al. 2008. SNPHarvester: A filtering-based approach for detecting epistatic interactions in genome-wide association studies. Bioinformatics, 25(4), 504–511.
Yang, Jian, Zhang, David, Yang, Jing-Yu, and Niu, Ben. 2007a. Globally maximizing, locally minimizing: Unsupervised discriminant projection with applications to face and palm biometrics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(4), 650–664.
Yang, Jianchao, Wright, John, Huang, Thomas S., and Ma, Yi. 2010. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11), 2861–2873.
Yang, Jun, Yan, Rong, and Hauptmann, Alexander G. 2007b. Adapting SVM classifiers to data with shifted distributions. Pages 69–76 of: Workshops Proceedings of the 7th IEEE International Conference on Data Mining.
Yang, Jun, Yan, Rong, and Hauptmann, Alexander G. 2007c. Cross-domain video concept detection using adaptive SVMs. Pages 188–197 of: Proceedings of the 15th ACM International Conference on Multimedia.
Yang, Liu, Hanneke, Steve, and Carbonell, Jaime G. 2013. A theory of transfer learning with applications to active learning. Machine Learning, 90(2), 161–189.
Yang, Min, Zhao, Zhou, Zhao, Wei, et al. 2017. Personalized response generation via domain adaptation. Pages 1021–1024 of: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Yang, Qiang, Chen, Yuqiang, Xue, Gui-Rong, Dai, Wenyuan, and Yu, Yong. 2009. Heterogeneous transfer learning for image clustering via the social web. Pages 1–9 of: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP.
Yang, Wen Hui, Dai, Dao Qing, and Yan, Hong. 2011. Finding correlated biclusters from gene expression data. IEEE Transactions on Knowledge and Data Engineering, 23(4), 568–584.
Yang, Zhilin, Salakhutdinov, Ruslan, and Cohen, William. 2016. Multi-task cross-lingual sequence tagging from scratch. CoRR, abs/1603.06270.
Yao, Kaisheng, Zweig, Geoffrey, Hwang, Mei-Yuh, Shi, Yangyang, and Yu, Dong. 2013. Recurrent neural networks for language understanding. Pages 2524–2528 of: Proceedings of the 14th Annual Conference of the International Speech Communication Association.
Yao, Kaisheng, Peng, Baolin, Zhang, Yu, et al. 2014. Spoken language understanding using long short-term memory neural networks. Pages 189–194 of: Proceedings of IEEE Spoken Language Technology Workshop.
Yao, Quanming, Wang, Mengshuo, Escalante, Hugo Jair, et al. 2018. Taking human out of learning applications: A survey on automated machine learning. CoRR, abs/1810.13306.
Yazdani, Majid, and Henderson, James. 2015. A model of zero-shot learning of spoken language understanding. Pages 244–249 of: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.
Ye, Jihang, Cheng, Hong, Zhu, Zhe, and Chen, Minghua. 2013. Predicting positive and negative links in signed social networks by transfer learning. Pages 1477–1488 of: Proceedings of the 22nd International Conference on World Wide Web.
Yi, Zili, Zhang, Hao, Tan, Ping, and Gong, Minglun. 2017. DualGAN: Unsupervised dual learning for image-to-image translation. Pages 2849–2857 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Yin, Haiyan, and Pan, Sinno Jialin. 2017. Knowledge transfer for deep reinforcement learning with hierarchical experience replay. Pages 1640–1646 of: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
Yin, Jie, Yang, Qiang, and Ni, Lionel M. 2005. Adaptive temporal radio maps for indoor location estimation. Pages 85–94 of: Proceedings of the 3rd IEEE International Conference on Pervasive Computing and Communications.
Yosinski, Jason, Clune, Jeff, Bengio, Yoshua, and Lipson, Hod. 2014. How transferable are features in deep neural networks? Pages 3320–3328 of: Advances in Neural Information Processing Systems.
Young, Steve, Gašić, Milica, Keizer, Simon, Mairesse, et al. 2010. The hidden information state model: A practical framework for POMDP-based spoken dialogue management. Computer Speech and Language, 24(2), 150–174.
Young, Steve, Gašić, Milica, Thomson, Blaise, and Williams, Jason D. 2013. POMDP-based statistical spoken dialog systems: A review. Proceedings of the IEEE, 101(5), 1160–1179.
Yu, Lantao, Zhang, Weinan, Wang, Jun, and Yu, Yong. 2017. SeqGAN: Sequence generative adversarial nets with policy gradient. Pages 2852–2858 of: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
Yu, Zhou, Wu, Fei, Yang, Yi, et al. 2014. Discriminative coupled dictionary hashing for fast cross-media retrieval. Pages 395–404 of: Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Zadrozny, Bianca. 2004. Learning and evaluating classifiers under sample selection bias. Proceedings of the Twenty-First International Conference on Machine Learning.
Zhang, Chao, Zhang, Lei, and Ye, Jieping. 2012. Generalization bounds for domain adaptation. Advances in Neural Information Processing Systems.
Zhang, Duo, Mei, Qiaozhu, and Zhai, Chengxiang. 2010a. Cross-lingual latent topic extraction. Pages 1128–1137 of: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics.
Zhang, Jing, Ding, Zewei, Li, Wanqing, and Ogunbona, Philip. 2018. Importance weighted adversarial nets for partial domain adaptation. Pages 8156–8164 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Zhang, Jingwei, Springenberg, Jost Tobias, Boedecker, Joschka, and Burgard, Wolfram. 2017a. Deep reinforcement learning with successor features for navigation across similar environments. Pages 2371–2378 of: Proceedings of 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems.
Zhang, Jintao, and Huan, Jun. 2012. Inductive multi-task learning with multiple view data. Pages 543–551 of: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Zhang, Kai, Gray, Joe W., and Parvin, Bahram. 2010b. Sparse multitask regression for identifying common mechanism of response to therapeutic targets. Bioinformatics, 26, i97–i105.
Zhang, Kai, Zheng, Vincent W., Wang, Qiaojun, et al. 2013. Covariate shift in Hilbert space: A solution via surrogate kernels. Pages 388–395 of: Proceedings of the 30th International Conference on Machine Learning.
Zhang, Lei, Zuo, Wangmeng, and Zhang, David. 2016. LSDT: Latent sparse domain transfer learning for visual adaptation. IEEE Transactions on Image Processing, 25(3), 1177–1191.
Zhang, Tong. 2002. Covering number bounds for certain regularized linear function classes. Journal of Machine Learning Research, 2, 527–550.
Zhang, Weinan, Liu, Ting, Wang, Yifa, and Zhu, Qingfu. 2017b. Neural personalized response generation as domain adaptation. CoRR, abs/1701.02073.
Zhang, Wenlu, Li, Rongjian, Zeng, Tao, Sun, et al. 2015a. Deep model based transfer and multi-task learning for biological image analysis. Pages 1475–1484 of: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Zhang, Wenlu, Li, Rongjian, Zeng, Tao, et al. 2017c. Deep model based transfer and multi-task learning for biological image analysis. IEEE Transactions on Big Data.
Zhang, Xiao-Lei. 2015a. Convex discriminative multitask clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(1), 28–40.
Zhang, Xucong, Sugano, Yusuke, Fritz, Mario, and Bulling, Andreas. 2015b. Appearance-based gaze estimation in the wild. Pages 4511–4520 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Zhang, Yi, and Schneider, Jeff G. 2010. Learning multiple tasks with a sparse matrix-normal penalty. Pages 2550–2558 of: Advances in Neural Information Processing Systems.
Zhang, Yongfeng, Ai, Qingyao, Chen, Xu, and Croft, W. Bruce. 2017d. Joint representation learning for top-N recommendation with heterogenous information sources. Pages 1449–1458 of: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management.
Zhang, Yu. 2013. Heterogeneous-neighborhood-based multi-task local learning algorithms. Pages 1896–1904 of: Advances in Neural Information Processing Systems.
Zhang, Yu. 2015b. Multi-task learning and algorithmic stability. Pages 3181–3187 of: Proceedings of the 29th AAAI Conference on Artificial Intelligence.
Zhang, Yu. 2015c. Parallel multi-task learning. Pages 629–638 of: Proceedings of the IEEE International Conference on Data Mining.
Zhang, Yu, and Yang, Qiang. 2017a. Learning sparse task relations in multi-task learning. Pages 2914–2920 of: Proceedings of the 31st AAAI Conference on Artificial Intelligence.
Zhang, Yu, and Yang, Qiang. 2017b. A survey on multi-task learning. CoRR, abs/1707.08114v2.
Zhang, Yu, and Yeung, Dit-Yan. 2009. Semi-supervised multi-task regression. Pages 617–631 of: Proceedings of European Conference on Machine Learning and Knowledge Discovery in Databases.
Zhang, Yu, and Yeung, Dit-Yan. 2010a. A convex formulation for learning task relationships in multi-task learning. Pages 733–742 of: Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence.
Zhang, Yu, and Yeung, Dit-Yan. 2010b. Multi-task learning using generalized t process. Pages 964–971 of: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics.
Zhang, Yu, and Yeung, Dit-Yan. 2012. Multi-task boosting by exploiting task relationships. Pages 697–710 of: Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.
Zhang, Yu, and Yeung, Dit-Yan. 2013a. Learning high-order task relationships in multi-task learning. Pages 1917–1923 of: Proceedings of the 23rd International Joint Conference on Artificial Intelligence.
Zhang, Yu, and Yeung, Dit-Yan. 2013b. Multilabel relationship learning. ACM Transactions on Knowledge Discovery from Data, 7(2), article 7.
Zhang, Yu, and Yeung, Dit-Yan. 2014. A regularization approach to learning task relationships in multitask learning. ACM Transactions on Knowledge Discovery from Data, 8(3), article 12.
Zhang, Yu, Yeung, Dit-Yan, and Xu, Qian. 2010c. Probabilistic multi-task feature selection. Pages 2559–2567 of: Advances in Neural Information Processing Systems.
Zhang, Zhanpeng, Luo, Ping, Loy, Chen Change, and Tang, Xiaoou. 2014. Facial landmark detection by deep multi-task learning. Pages 94–108 of: Proceedings of the 13th European Conference on Computer Vision.
Zhao, Junbo Jake, Mathieu, Michaël, and LeCun, Yann. 2016. Energy-based generative adversarial network. CoRR, abs/1609.03126.
Zhao, Kai, and Huang, Liang. 2017. Joint syntacto-discourse parsing and syntacto-discourse treebank. Pages 2117–2123 of: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
Zhao, Xiangyu, Zhang, Liang, Ding, Zhuoye, et al. 2018. Deep reinforcement learning for list-wise recommendations. CoRR, abs/1801.00209.
Zheng, Vincent W., Pan, Sinno J., Yang, Qiang, and Pan, Jeffrey J. 2008a. Transferring multi-device localization models using latent multi-task learning. Pages 1427–1432 of: Proceedings of the 23rd AAAI Conference on Artificial Intelligence.
Zheng, Vincent W., Xiang, Evan Wei, Yang, Qiang, and Shen, Dou. 2008b. Transferring localization models over time. Pages 1421–1426 of: Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence.
Zheng, Vincent W., Cao, Hong, Gao, Shenghua, et al. 2016. Cold-start heterogenous-device wireless localization. Pages 1429–1435 of: Proceedings of the 30th AAAI Conference on Artificial Intelligence.
Zheng, Vincent Wenchen, Hu, Derek Hao, and Yang, Qiang. 2009. Cross-domain activity recognition. Pages 61–70 of: Proceedings of the 11th International Conference on Ubiquitous Computing.
Zhou, Guangyou, Xie, Zhiwen, Huang, Jimmy Xiangji, and He, Tingting. 2016. Bi-transferring deep neural networks for domain adaptation. Pages 322–332 of: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.
Zhou, Joey Tianyi, Pan, Sinno Jialin, Tsang, Ivor W., and Yan, Yan. 2014a. Hybrid heterogeneous transfer learning through deep learning. Pages 2213–2219 of: Proceedings of the 28th AAAI Conference on Artificial Intelligence.
Zhou, Joey Tianyi, Tsang, Ivor W., Pan, Sinno Jialin, and Tan, Mingkui. 2014b. Heterogeneous domain adaptation for multiple classes. Pages 1095–1103 of: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics.
Zhu, Fan, Shao, Ling, and Yu, Mengyang. 2014. Cross-modality submodular dictionary learning for information retrieval. Pages 1479–1488 of: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management.
Zhu, Feng, Wang, Yan, Chen, Chaochao, et al. 2018. A deep framework for cross-domain and cross-system recommendations. Pages 3711–3717 of: Proceedings of the 27th International Joint Conference on Artificial Intelligence.
Zhu, Jun-Yan, Park, Taesung, Isola, Phillip, and Efros, Alexei A. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. Pages 2223–2232 of: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Zhu, Xiaojin. 2005. Semi-supervised Learning Literature Survey, Tech. Report, Computer Sciences TR 1530, University of Wisconsin-Madison.
Zhu, Yin, Chen, Yuqiang, Lu, Zhongqi, et al. 2011. Heterogeneous transfer learning for image classification. Proceedings of the 25th AAAI Conference on Artificial Intelligence.
Zhuang, Yue Ting, Wang, Yan Fei, Wu, Fei, Zhang, Yin, and Lu, Weiming. 2013. Supervised coupled dictionary learning with group structures for multi-modal retrieval. Proceedings of the 27th AAAI Conference on Artificial Intelligence.
Zhuo, Hankz Hankui, and Yang, Qiang. 2014. Action-model acquisition for planning via transfer learning. Artificial Intelligence, 212, 80–103.
Zilka, Lukas, and Jurcicek, Filip. 2015. Incremental LSTM-based dialog state tracker. Pages 757–762 of: Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding.
Ziser, Yftah, and Reichart, Roi. 2017. Neural structural correspondence learning for domain adaptation. Pages 400–410 of: Proceedings of the 21st Conference on Computational Natural Language Learning.
Ziser, Yftah, and Reichart, Roi. 2018. Pivot based language modeling for improved neural domain adaptation. Pages 1241–1251 of: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Zoph, Barret, and Knight, Kevin. 2016. Multi-source neural translation. Pages 30–34 of: Proceedings of The 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Zoph, Barret, Yuret, Deniz, May, Jonathan, and Knight, Kevin. 2016. Transfer learning for low-resource neural machine translation. CoRR, abs/1604.02201.
Zweig, Alon, and Weinshall, Daphna. 2013. Hierarchical regularization cascade for joint learning. Pages 37–45 of: Proceedings of the 30th International Conference on Machine Learning.
